## Abstract

In this paper we present our findings from a lab and a field study investigating how passers-by notice the interactivity of public displays. We designed an interactive installation that uses visual feedback to the incidental movements of passers-by to communicate its interactivity. The lab study reveals: (1) Mirrored user silhouettes and images are more effective than avatar-like representations. (2) It takes time to notice the interactivity (approximately 1.2 s). In the field study, three displays were installed for three weeks in shop windows, and data about 502 interaction sessions were collected. Our observations show: (1) Significantly more passers-by interact when immediately showing the mirrored user image (+90%) or silhouette (+47%) compared to a traditional attract sequence with call-to-action. (2) Passers-by often notice interactivity late and have to walk back to interact (the landing effect). (3) If somebody is already interacting, others begin interaction behind the ones already interacting, forming multiple rows (the honeypot effect). Our findings can be used to design public display applications and shop windows that more effectively communicate interactivity to passers-by.

 J. Müller, R. Walter, G. Bailly, M. Nischt, and F. Alt, “Looking Glass: A Field Study on Noticing Interactivity of a Shop Window,” in Proceedings of the 2012 ACM Conference on Human Factors in Computing Systems, New York, NY, USA, 2012, pp. 297-306. [DOI] [PDF]

## Introduction

A major challenge when creating engaging public displays is that people passing by are usually not aware of any interactive capabilities. Unlike with privately owned devices, such as mobile phones or PCs, people simply do not know or expect that public displays are interactive, an effect that has been amplified by displays having been used for static ads from their very advent. If public displays cannot communicate their interactivity, they will hardly be used and will not fulfill their purpose. We believe that this issue will become even more apparent in the future, as current LCD technology for public displays is likely to be replaced by technologies that more closely resemble traditional paper (e.g., e-paper [5]). As a consequence, passers-by might not even be able to notice that a surface is digital unless the content is constantly moving.

To put this problem in context, passers-by of public displays need to (1) notice the display, (2) understand that it is interactive, and (3) be motivated to interact with it (not necessarily in this order). Mueller et al. [19] discuss the role of attention and motivation. However, relatively little is known about understanding interactivity (2), which is the focus of this paper. Previous solutions involve calls-to-action and attract loops [14]. A call-to-action, like a “Touch to start” label, can be effective. However, text or symbols are language and culture dependent and complex to understand subconsciously. Attract loops, such as a video of a person interacting, can create the atmosphere of an arcade game and can be similarly complex to understand.

In this paper we investigate how feedback to the passer-by’s incidental movements (e.g., a mirror image) can be used to communicate the interactivity of a display (see Figure 1). As humans are very efficient at recognizing human motion [2] as well as their own mirror image [18], this technique benefits from these perceptual mechanisms.

After discussing psychological foundations, we report and discuss the results of a lab and a field study. In the initial lab study we were able to show that a real-time video image and a silhouette of the user are equally effective for recognizing interactivity. Avatar-like and more abstract representations are less effective. On average, people required 1.2 s to recognize interactivity for the mirrored video. In the subsequent field study we deployed and tested three displays in a shop over the course of three weeks. Our observations show: (1) Significantly more passers-by interact when immediately showing the mirrored user image (90% more) or silhouette (47% more) compared to a traditional attract sequence with call-to-action. (2) Passers-by often recognize interactivity after they have already passed. Hence they have to walk back; we call this the landing effect. (3) Often passers-by notice interactivity because of somebody already interacting. They position themselves in a way that allows them to see both the user and the display, allowing them to understand the interactivity. If they start interacting themselves, they do so behind the person interacting, hence forming multiple rows. Our observations can be useful for designers of public displays who need to communicate interactivity to passers-by, and more generally, for any designer of devices where users do not know in advance that the device is interactive.

## Studies

To explore how inadvertent interaction and representations of the user can be used to communicate the interactivity of public displays, we conducted a series of three user studies. We developed a series of prototypes that were successively refined based on the results of these studies. During these studies the focus was on noticing interactivity rather than attention or motivation. We simply relied on the motion of the user representation to capture attention and on a very simple game (playing with balls) to motivate users. More elaborate attention grabbing or motivating techniques would probably increase the total number and duration of interactions.

## Mirror, Silhouette and Call-to-Action

### Recommendations

Inadvertent interaction outperforms the attract loop with call-to-action in attracting interactions. The Mirror representation also outperforms the Silhouette and interaction without user representation. In contrast to the lab study, the Mirror representation works significantly better than the Silhouette. From this we learn that the Mirror representation is a powerful cue to communicate interactivity, although Silhouettes may have some benefits, such as allowing more artistic freedom in designing the content and providing more anonymity. As most people recognize themselves on the display rather than someone else, displays should be positioned so that people can see themselves well when passing by. Over time, as knowledge about the interactive device builds up, these interactivity cues become less important.

The total number of interactions during the 11 coded days is shown in Table~\ref{tbl:results}. We compared the number of interactions per day. An ANOVA reveals a significant effect for interactivity cue (call-to-action vs. inadvertent interaction) (F(1,11)=12.6, p<.001). A post-hoc Tukey test shows that passers-by interact more in the inadvertent interaction condition than in the call-to-action condition. The ANOVA also reveals a significant effect for user representation (F(2,22)=13.1). A post-hoc Tukey test shows that Mirror is more effective than Silhouette and No-Representation. Finally, the ANOVA also reveals a user representation × interactivity cue interaction (F(2,22)=6.8, p<0.005). As expected, there are no significant differences between the user representations for call-to-action. User representations differ only in the inadvertent interaction condition. Many interactions with the display last only for seconds.

The interviews revealed different preferences for the user representations. The shop owner prefers the Silhouette, as people are covered in company colors. There is no clear user preference, and many say that they like the representation they discovered first. Users who prefer the Mirror representation describe it as ‘more authentic’ and ‘more fun’, and they like to see themselves and their friends. Users who prefer the Silhouette representation describe it as ‘more anonymous’ and say that they like it when bystanders cannot see their image. Some also say that they do not like to see themselves and prefer the Silhouette representation. For the Mirror representation, some users also mention that they do not like to be observed by a camera, which they do not say about the Silhouette representation.

From our observations, we found that in the call-to-action conditions, people spend several seconds in front of the display before following the instructions (‘Step Close to Play’). In one vignette, two women observe the display for some time before one of them steps closer and activates the interaction in the Mirror condition. They are surprised to see themselves and walk away. A few meters further, they notice a second display running the inadvertent interaction Silhouette condition, where they start to play.

When interviewed on how they noticed interactivity, most people say that they saw themselves on the display. Some also say that they saw themselves and a friend or partner at the same time. Only very few stated that they had seen the representation of another person walking in front of them.

When a crowd had gathered around the display, it was sometimes very difficult to distinguish who caused which effect. This was especially true for the Silhouette and, obviously, the No-Representation conditions. In these cases we observed people copying the movements of other users and seemingly interacting with the screen, even though they were not represented on the screen. Sometimes they were not even standing in the field of view of the camera. This can be an example of *overattribution*, where people assume they are causing some effects although they are not.

Over time, knowledge about the presence and interactivity of the displays built up among people who pass the location regularly. In the third week of deployment, a number of people who interacted said that they had seen somebody else interacting, e.g., ‘a few weeks ago’ or ‘earlier that day’, but had not tried to interact themselves. There were also a few regular players. For example, we noticed from the logs that between 7 and 8 am, there was considerable activity in front of the displays. Observations revealed that a number of children played regularly with the displays on their way to school. We observed them waiting expectantly at the traffic light, then crossing the street directly to the display to play with it. Such interaction is obviously different from situations where people encounter the displays for the first time.

## The Landing Effect

### Recommendations

The landing effect is in line with our observation from the lab. People need approximately 1.2 s (Mirror) and 1.6 s (Silhouette) to recognize interactivity. They also need to notice the display first and be motivated to interact. With an average walking speed of 1.4 m/s, by the time passers-by have decided to interact, they have already passed the display. This effect is so strong that it should be designed for in any public display installation. Displays should be placed so that, when people decide to interact, they are still in front of the display and do not have to walk back. Optimally, when users stop friends walking in front of them, the friends should also still be able to interact with the display without walking back. This could be achieved by designing very wide displays (several meters), or more practically, a series of displays along the same trajectory. Another solution would be to place displays in a way so that users walk directly towards them, but this is only possible for very few shop windows.
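The arithmetic behind the landing effect can be made explicit with a minimal sketch that multiplies the walking speed by the recognition times quoted above. The function and constant names are hypothetical, chosen only for illustration. The result is a lower bound on the distance walked, because it ignores the additional time needed to notice the display and to decide to interact.

```python
# Illustrative sketch only: names are hypothetical, values are taken from the text.
WALKING_SPEED_M_PER_S = 1.4  # average walking speed quoted in the text

# Recognition times from the lab study, in seconds.
RECOGNITION_TIME_S = {"Mirror": 1.2, "Silhouette": 1.6}


def distance_walked(recognition_time_s: float,
                    speed_m_per_s: float = WALKING_SPEED_M_PER_S) -> float:
    """Distance in meters covered while recognizing interactivity."""
    return speed_m_per_s * recognition_time_s


for name, seconds in RECOGNITION_TIME_S.items():
    print(f"{name}: {distance_walked(seconds):.2f} m")
# Mirror: 1.68 m
# Silhouette: 2.24 m
```

Under these assumptions a passer-by covers roughly 1.7 m (Mirror) to 2.2 m (Silhouette) during recognition alone, which gives a sense of why a display of ordinary width is already behind them when they decide to interact.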

## Dynamics Between Groups

### Recommendations

The honeypot effect is a very powerful cue to attract attention and communicate interactivity. Displays that manage to attract many people interacting will be able to attract more and more people. The honeypot effect works even after multiple days, as people who have seen somebody interacting previously may also try the interaction in the future (see the section Mirror, Silhouette and Call-to-Action). To achieve this, displays should be designed to have someone visibly interacting with them as often as possible. This can be achieved by increasing the motivation and persuasion for people to play longer. Because audience members reposition themselves so that they can see both the user and the display, the environment needs to be designed to support this. In our case, both the subway entrance and the narrow sidewalk limited the possible size of the audience. In order to support a larger audience, displays should be visible from a wide angle, or considerable space should be available directly in front of the displays. This is also necessary because different groups start to interact behind each other. This interaction behind each other should also be supported, e.g., by increasing the maximum interaction distance beyond the distance from which single groups normally interact.

We observed many situations in which different groups started to interact. The first group (or person) usually causes what has been previously termed the ‘honeypot effect’. We found that people passing by first observed somebody making unconventional movements while looking into a shop window (the manipulation). They subsequently positioned themselves in a way that allowed them to see and understand the reason for these movements, usually in a location that allowed both the persons interacting and the display to be seen (Figure~\ref{fig:repositioning}). In this figure, a man interacting with the display with expressive gestures attracts considerable attention. The crowd stops and stares at him and the display, and ends up partially blocking the way for other passers-by. Newcomers seem to be first attracted by the crowd, then follow their gaze, then see the man interacting, follow his gaze and eventually reposition themselves so they can see both the man and the display. They also seem to prefer to stand a bit to the side, so that they are not represented on the screen. The audience is mostly positioned behind the user. We observed this pattern regularly.

When people in the audience decided to join the interaction, they accordingly did so *behind* the ones already interacting and not next to them (Figure~\ref{fig:multiple_rows}). In this figure, the little girl in the front noticed the interactivity first, followed by her mother, who then stopped to explore the display together with the daughter (the father did not walk back and is standing behind the camera). The young woman behind them was attracted by their interaction and eventually also interacted behind them. This again attracted the couple behind them, and the girl finally also started interacting in a third row. In some cases, such multiple rows form again from people observing at the subway entrance.

In the few cases where other people started to interact in the same row as people already interacting, we were able to observe social interaction between the users, which we did not observe for different groups interacting behind each other. People interacting with the screens were usually standing in the way of others. The resulting conflicts were solved in different ways. For the screen installed near the subway entrance, passers-by usually tried to pass behind the ones already interacting, so as not to disturb them. When multiple rows of people interacted, however, this was not possible, and they passed in front of them (Figure~\ref{fig:multiple_rows}). When a large group passed by, we sometimes observed that the person interacting abandoned the display. This in turn sometimes let someone from the arriving group take their place and start playing. We also saw some occasions where users deliberately moved between the display and the person interacting and interacted for a very short moment.

## Dynamics Within Groups

### Recommendations

The most important observation from this section is that very few people who are alone interact. This observation is supported by the results of the pre-study. Therefore it is important to understand how groups notice interactivity, and public displays should always be designed to support groups. Even if just one person is interacting, the display must provide some value for the other group members. When users strongly engage with their representation on the screen, they may forget about their real surroundings. According to our observations, slow-moving objects make users move more slowly, which increases safety.

We discovered that the vast majority of interactions are from people traveling in a group. The only cases of single people interacting that we observed personally are the children before or after school hours, men waiting for a considerable amount of time near the subway entrance, a man in rags, and a man filming himself while playing. One man, for example, waited for several minutes directly in front of one screen while incidentally interacting with it through his movements. After some time, he was approached by an apparent stranger, who showed him the display and the fact that he was interacting. The man seemed surprised and continued to play a little with the display. While a considerable number of single people pass by the store, they usually walk faster and look more straight ahead and downwards. When we interviewed some of them, only very few had noticed the screens at all, and nobody had noticed that the screens were interactive. Between one and five persons interacted simultaneously (avg.=1.5). Often, the first person in a group noticed the display first, but this was not always the case.

We discovered that people strongly engage with the game and apparently focus more on their representation on the screen than on the possible influence of their movements on people around them (see Section~\ref{sec:psychological-cues}). This sometimes leads to situations where people are no longer aware of their neighbors (people belonging to one group usually line up next to each other), even though they are able to see their neighbors' representations on the screen. In some situations this focus on the virtual space leads people to accidentally hit or bump into each other.

Another observation was that people usually start interacting with very subtle movements and continuously increase the expressiveness of their movements. This process sometimes takes just a few seconds and sometimes extends over many minutes. The subtle movements at the beginning are sometimes just slight movements of the head or the foot. Later, people proceed with extensive gestures using both arms, jumping, and even acrobatic movements like high kicks with the legs.


## Related Publications

J. Müller, R. Walter, G. Bailly, M. Nischt, and F. Alt, “Looking Glass: A Field Study on Noticing Interactivity of a Shop Window,” in Proceedings of the 2012 ACM Conference on Human Factors in Computing Systems, New York, NY, USA, 2012, pp. 297-306. [DOI] [PDF]

J. Müller, R. Walter, G. Bailly, M. Nischt, and F. Alt, “Looking Glass: A Field Study on Noticing Interactivity of a Shop Window (Video),” in Adjunct Proceedings of the 2012 ACM Conference on Human Factors in Computing Systems, New York, NY, USA, 2012, pp. 297-306. [DOI]