Ethnographic Eye-Tracking Interviews: Analyzing Visual Perception and Practices of Looking

Abstract: In this article, we present methods for the use of eye-tracking in interviews in order to reflect on visual perception and practices of looking as part of an ethnography of the senses. The methods are based on two multi-year ethnographic studies involving eye trackers. In the first one, researchers used mobile eye trackers to study how art museum visitors approach digital image technologies. In the other, they relied on stationary eye trackers to investigate practices on digital image platforms. We discuss how video recordings of participants' eye movements were made and describe the process of conducting ethnographic interviews based on the videos. The eye-tracking interviews can be used 1. to make participants aware of and think about practices of looking; 2. to verbalize in dialogue sensory and interpretative processes regarding museum objects and digital image technologies; and 3. to surface individuals' aesthetic preferences and incorporated knowledge.

Key words: eye tracking; interviews; ethnography; visual perception; museum; images; digital technologies

Ethnography represents a firmly established methodology in cultural anthropology (BEER & KÖNIG, 2020; BISCHOFF, OEHME-JÜNGLING & LEIMGRUBER, 2014; HESS, MOSER & SCHWERTL, 2013) and in the qualitative social sciences (BREIDENSTEIN, HIRSCHAUER, KALTHOFF & NIESWAND, 2015; KNOBLAUCH, 2005), and it also assumes an important function in other fields such as media studies. Significant debates have emerged in the past years surrounding the use of multisensory ethnography, which involves the analysis of and the reflection on sensory perception (see, for example, BENDIX, 2006; PINK, 2009). The challenges that arise in describing sensory phenomena include interpreting physical and sensory experience based on cultural categories, understanding people's modes of perception and action through their biographies, becoming aware of one's own sensory bias, accounting for the material dimension of sensory experience, and analyzing the influence of digital communication media on sensory perception (PINK, 2009, p.79). [1]

In multisensory ethnography, researchers are required to sensitize their own perceptions, align their own "sensory antennas" to the research context (BENDIX, 2006, p.81; see also WILLKOMM, 2014) and develop new methodological approaches that ensure a more comprehensive understanding of sensory perception. In the following pages, we specifically ask about the possibilities and limitations of such an ethnography in the analysis of visual perception, and discuss the potential of eye tracking (hereafter: ET) as a research method. A central challenge is how to combine the interpretative approaches of ethnography with the eye-tracking methods of cognitive science and computer science. We propose an approach in which we bring together important elements of each—and it is precisely here that its analytical value lies. [2]

As our starting point, we understand visual perception not as a cognitive process but as constituted by socio-cultural practices. We therefore speak of practices of looking, a term inspired by Marita STURKEN and Lisa CARTWRIGHT:

"Nobody is free to look as they please, not in any context. We all perform within (and against) the conventions of cultural frameworks that include nation, religion, politics, family, school, work, and health. [...] How you look can also refer to the practices in which you engage to view, understand, appreciate, and make meaning of the world. To look, in this sense, is to use your visual apparatus, which includes your eyes and hands, and also technologies like your glasses, your camera, your computer, and your phone, to engage the world through sight and image" (2018, p.1). [3]

It is also helpful to think of visual perception as skilled vision, i.e., "a social activity, a proactive engagement with the world, a realm of expertise that depends heavily on trained perception and on a structured environment" (DYER & PINK, 2015, p.9; see also GRASSENI, 2010). That is to say, viewing practices are not simply intentional actions but learned socio-cultural routines guided by an actor's incorporated knowledge. Sophia PRINZ put forward a similar argument:

"What the subject can see and how it looks at something depends on the implicit knowledge it forms in the course of its life, which ways of seeing are 'provided' by its environment, and which practices it exercises in each case" (2014, p.329).¹⁾ [4]

But what methods are suited to apprehend these practices? From an ethnographic perspective, they become particularly palpable when it is possible to articulate and compare processes of visual perception. Photographic representations and photo-elicitation interviews are one way to think with people about their sensory experiences, their categories of relevance, their value systems, their incorporated knowledge, and systems of meaning (HARPER, 2002; SAINI & SCHÄRER, 2014). Another way is offered by walking interviews that focus on visual perception (SCHWANHÄUSER, 2015, pp.87-88). But these approaches tend to take sense perception or viewing practices as jumping-off points for ethnographic reflection rather than making them the subject themselves. To do the latter, one must ask: How do we see seeing? How can ethnographers recognize and describe practices of looking? What methods can they use to capture viewing patterns? A strength of ethnographic approaches is that researchers can combine different methods and utilize different types of data. Therefore, our proposal for identifying practices of looking is to integrate ET procedures into ethnographic research designs. [5]

Below we outline the literature at the junction of ethnography and ET (Section 2), provide an overview of the project that laid the foundation for our methodology (Section 3), and describe the research design of the study in detail (Section 4). Then, in the main part of the article, we present numerous examples to demonstrate the methodological value of ethnographic ET interviews (Section 5). We conclude by summarizing our argument and identifying some problems associated with it (Section 6). [6]

ET is a process for tracking eye movements using technological aids (BOJKO, 2013). In ET systems, a reflection is generated on the cornea by means of infrared diodes and then recorded by an infrared camera. In this way, it is possible to measure direction of sight and eye movement (p.8). In information science, psychology, usability research, and numerous other fields, ET systems are used to record pupil movements while people examine, process, and evaluate knowledge resources; to assess the effectiveness and usability of digital systems, communication technologies, media content, and spatial interaction elements; and to consider modes of artistic presentation and the potential of museum arrangements in capturing visitors' attention. SYKES et al. (2010) demonstrated the effectiveness of ET in evaluating the search functions of digital libraries; BURLAMAQUI and DONG (2017) used ET methods to investigate possible correlations between eye movements and the perceived affordances of (digital) objects; and KRUG (2022, pp.233-239) embedded ET in the analysis of theater rehearsals. Finally, a number of scholars (among them EGHBAL-AZAR, 2016; EGHBAL-AZAR & WIDLOK, 2013; REITSTÄTTER et al., 2020; SCHWAN, GUSSMANN, GERJETS, DRECOLL & FEIBER, 2020) have used ET to better understand the perceptions and experiences of museum visitors. [7]

Other studies that are particularly valuable for our approach are those whose authors identified possibilities at the interface of ethnography and ET. DYER and PINK (2015), for example, investigated the influence of auditory narration on the perception of visual representation in complex units of cinematic information. They described the possible links between an ethnography of the senses in which seeing is defined as a situational practice, eye-tracking studies in information science, and findings from the neurosciences. However, they discussed only in passing the methodological potential of bringing together complementary areas of expertise. [8]

SUMARTOJO, DYER, GARCÍA and CRUZ (2017) used a combination of ethnography and ET to shed light on the complex dynamics of visual perception and photographic processes in urban environments. While in one case study they evaluated only the visualization tendencies and preference patterns recorded by ET, in another they used ethnographic go-along interviews to contextualize the ET images through the situational sensory experiences and self-perceptions of the participants. The authors saw in their second study the particular potential of combining ethnographic approaches with ET methods:

"[T]he value of combining ethnography with eye-tracking is what it offers to contextualise visual perception in a wider set of material and immaterial aspects of our surroundings and how we understand them, helping to move towards a more complete picture of what we see when we look" (p.78). [9]

While the experimental research design of the study highlighted the necessity for ethnographic contextual knowledge in the evaluation of eye movements and fixation points, SUMARTOJO et al. said little about the knowledge gained from such a combination. We have taken elements of the aforementioned methods to create our own approach, which we refer to as ethnographic eye-tracking interviews (hereafter: EET interviews). Here, ET videos of eye movements are used to address physical processes of perception in ethnographic dialogue with the participants. [10]

It should be noted that interview-based approaches have also been used in traditional ET. These are usually referred to as verbal protocols, and can be divided into concurrent verbal protocols (CVP) and retrospective verbal protocols (RVP) (BOJKO, 2013, pp.108-119; VAN DEN HAAK, DE JONG & SCHELLENS, 2003).²⁾ CVPs are conducted during ET and RVPs are conducted after ET. The latter can provide some principles for EET interviews, but they are very different from ethnographic interviews in terms of the questions asked and the way they are conducted (e.g., the videos are not always viewed together with the study participants). [11]

RVPs in user experience research are used to identify and solve problems in system designs. EET interviews, by contrast, are used to describe and think about perceptual processes within the framework of comprehensive ethnographic research designs. RVPs tend to be limited to the processes documented in ET. In EET interviews, researchers attempt to establish connections between the perceptual processes in the ET videos and the life worlds of the study participants based on ethnographic research. [12]

In RVPs, separate roles for interviewers and interviewees seem to be standard, with the former acting neutrally and cautiously so as to allow the latter "a quite realistic reflection of participants' behavior" (BOJKO, 2013, p.108). With EET interviews, researchers do not aim at this type of realism. Rather, they gain ethnographic findings from dialogic interactions with participants they regard as their equals. In the following pages, these and other differences (such as gaining knowledge) will be discussed in more detail. Ultimately, however, a categorical distinction between RVPs and EET interviews is neither possible nor meaningful—both methods can complement each other productively in suitable research designs. What EET interviews and RVPs have in common is that dialogic interaction takes place after ET. This has some analytical limitations, but it also offers special opportunities for analytical reflection, which we discuss later in this article. [13]

The EET interviews were conducted as part of the research project "Curating Digital Images: Ethnographic Perspectives on the Affordances of Digital Images in Museum and Heritage Contexts" (BAREITHER et al., 2021; BAREITHER, GEIS, ULLRICH et al., 2023; ULLRICH, 2021; ULLRICH & GEIS, 2021).³⁾ We examined the interrelationships between digital image technologies and museum spaces in two fields of work. By "museum spaces," we mean both museums and memorials and the digital spaces associated with them (in particular digital image platforms, forums and social media). Museum spaces are productive fields in which to ask what exactly "happens in the eye [or] the perceiving body of the aesthetically trained actor when he or she looks at a work of art or enters a museum space" (PRINZ, 2014, p.304). The ET study was conducted by us together with Elke GREIFENEDER and Vera HILLEBRAND from the Information Science Laboratory (iLab) at the Humboldt-Universität zu Berlin, who were responsible for the technological implementation and analysis of the data. [14]

In the following pages, we focus on the integration of the ET methods into the ethnographic research designs of studies conducted by ULLRICH, GEIS, and BAREITHER. While GEIS focused intensively on the ways in which museum databases and archives are used in her sub-study, ULLRICH examined how social media might transform museum spaces (see also BUDGE, 2017). In both sub-studies, we adhered to an ethics of care in which we also took into account the specific conditions of ET technologies (KRUG & HEUSER, 2018). [15]

In our project, we aligned practices of looking with overarching practices of digital image curation (BAREITHER et al., 2021; BAREITHER, GEIS, ULLRICH et al., 2023). For us, curation seeks to establish and shape relationships between incorporated knowledge, information, experiences, feelings, meanings, and artifacts. The core argument of the project was that digital image technologies offer new possibilities for curating practices in museum spaces. Curating primarily referred not to the work of professional curators, but to the everyday practices of people who visit museum spaces equipped with digital image technologies or who receive and process images of museum artifacts on online platforms. [16]

We used the theoretical model of affordances to bring out the influence of digital technologies within image curatorial processes (BAREITHER, 2020). By affordances, we mean the possibilities and limitations of technologies to shape specific practices and experiences. The affordances of digital image technologies make certain aspects of material and digital museum spaces more visible or invisible. In this context, media scholar Judith WILLKOMM (2014) described how media technologies filter acoustic, optical and haptic sensory performances; they can supplement, expand or limit human perception. [17]

Our interest in the project, therefore, lay first in museum spaces and second in how the affordances of digital image technologies shape the curation of images. Third, we were interested in how viewing practices become an integral part of debates about image curation. In other words, we asked: To what extent does the eye co-curate? We have published substantive answers to these questions elsewhere (BAREITHER, GEIS, ULLRICH et al., 2023; in particular BAREITHER, GEIS, GREIFENEDER, HILLEBRAND & ULLRICH, 2023). In this article, that specific interest is still relevant, but it is not the focus; this is because the method of EET interviews can be used much more broadly for the ethnographic analysis of diverse practices of looking. [18]

In one of the sub-studies, we focused on social media and museum spaces. The questions were: How did museum visitors use smartphones and social media to curate their own experiences of museum spaces visually, what were their personal associations with them, and what was the role played by practices of looking in the process? To answer these questions, we used mobile ET systems that could be worn like glasses and connected a smartphone for video recording. In ET, the eye movements recorded by infrared diodes and cameras are divided into fixations and saccades (HOLMQVIST et al., 2011, pp.52f.). Fixations are moments in which the eyes rest on an area of a spatial or digital environment. These are short but comparatively static periods of looking characterized by ocular micro-movements. When considered together, the fixation points constitute a complex visual attention network in which sensory information and environmental details are filtered and processed. Saccades, by contrast, are rapid and usually conjugate eye movements that switch quickly between fixations to form a comprehensive picture of the material space or digital environment. Mobile eye trackers can also be used to record people's field of vision. The recordings made using the mobile eye trackers showed the museum space in which the participants moved, as well as the screens of the smartphones the people were using to record digital images and videos.⁴⁾

Fig. 1: ET video with a mobile eye-tracker in the Berlinische Galerie⁵⁾ [19]
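The distinction between fixations and saccades described above can be illustrated with a minimal sketch. It is an illustration only and not the procedure used in the project: it assumes gaze samples given as pixel coordinates in the scene video, and it separates slow from fast stretches with a simple velocity threshold. The names `GazeSample`, `Fixation`, and `detect_fixations`, as well as both threshold values, are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class GazeSample:
    t: float  # timestamp in seconds
    x: float  # horizontal gaze position in pixels (assumed unit)
    y: float  # vertical gaze position in pixels (assumed unit)


@dataclass
class Fixation:
    start: float
    end: float
    x: float  # mean horizontal position of the grouped samples
    y: float  # mean vertical position of the grouped samples

    @property
    def duration(self) -> float:
        return self.end - self.start


def detect_fixations(samples, max_speed=100.0, min_duration=0.1):
    """Group consecutive slow samples into fixations.

    Stretches in which the gaze moves faster than `max_speed`
    (pixels per second, a hypothetical threshold) are treated as
    saccades and only serve to separate one fixation from the next.
    """
    fixations = []
    group = []

    def flush():
        # Keep the group only if it lasted long enough to count as a fixation.
        if len(group) >= 2 and group[-1].t - group[0].t >= min_duration:
            fixations.append(Fixation(
                start=group[0].t,
                end=group[-1].t,
                x=sum(s.x for s in group) / len(group),
                y=sum(s.y for s in group) / len(group),
            ))
        group.clear()

    for prev, cur in zip(samples, samples[1:]):
        dt = cur.t - prev.t
        speed = hypot(cur.x - prev.x, cur.y - prev.y) / dt if dt > 0 else float("inf")
        if speed <= max_speed:
            if not group:
                group.append(prev)
            group.append(cur)
        else:
            flush()  # a fast movement ends the current fixation
    flush()
    return fixations
```

Research-grade ET software typically works with visual angle rather than pixels and applies additional filtering, so the thresholds here should be read as placeholders.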

The study participants in this area of work were ten social media users based in Berlin who regularly shared digital images from museum and art contexts. The decisive factor for participation in the study was not demographics, but familiarity with practices of digital image curation in museum spaces. These people were identified via the official Instagram accounts of numerous Berlin art museums, galleries, and studios (which, as part of an active digital communication policy, often repost visitor images on their page), or via location-based hashtags such as #onlyberlin and location markers. The social media users were initially contacted via the platform's messenger services. After this initial contact, the exchange mostly shifted to more direct communication channels such as WhatsApp and Telegram, where it was possible to share voice memos, documents, images, and videos with end-to-end encryption. [20]

We then invited the study participants to meet our team in a modern art museum (the Berlinische Galerie), where they were familiarized with the technical features of the mobile eye trackers. To ensure the best possible accuracy of the recordings, the eye trackers were recalibrated before the start of each session. The calibration determined the position of the eye and the reflection of the cornea in relation to a specific point in the material space or digital environment (BAUER & STOFER, 2013, p.4). When calibrating mobile ET systems, we took into account constantly changing variables: The visual conditions of the museum environment, the display elements of the smartphones, and the movements and physical positioning of the study participants. For this purpose, the participants were given eight to ten points in the material space and digital environment to fixate on for a few seconds. Visual stimuli in different luminance conditions and distances were selected in order to document eye movements during the study. [21]
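One simple way to check a calibration of this kind is to compare each target point with the gaze position recorded while the participant fixated on it. The sketch below is an assumption about how such a check could look, not a description of the software used in the project; the function names and the tolerance value are hypothetical, and positions are assumed to be given in the same units (e.g., pixels of the scene video).

```python
from math import hypot
from statistics import mean


def calibration_offsets(targets, gaze):
    """Return the distance between each target point and the recorded gaze position.

    `targets` and `gaze` are equally long lists of (x, y) tuples,
    one pair per calibration point.
    """
    if len(targets) != len(gaze):
        raise ValueError("each target needs exactly one recorded gaze position")
    return [hypot(gx - tx, gy - ty) for (tx, ty), (gx, gy) in zip(targets, gaze)]


def needs_recalibration(offsets, tolerance):
    """Flag a session whose mean offset exceeds a (hypothetical) tolerance."""
    return mean(offsets) > tolerance


# Example with three invented calibration points
targets = [(0, 0), (100, 0), (0, 100)]
gaze = [(3, 4), (100, 0), (6, 108)]
offsets = calibration_offsets(targets, gaze)
print(offsets, needs_recalibration(offsets, tolerance=4.0))
```

With eight to ten points spread over different distances and luminance conditions, the per-point offsets would also show where in the field of vision the recording is least reliable.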

Once the experiment had been prepared and explained, we asked the study participants to visit a specific part of the permanent exhibition for around 30 minutes, and to look for images and take photos for their respective social media accounts. The recordings of the eye movements and the environment could be followed by the study leaders on an external tablet. The participants were also accompanied by Sarah ULLRICH. The notes the study leaders made about the observed photographic and curatorial practices provided further context for the subsequent ethnographic interviews (SPRADLEY, 2016 [1979]) and structured the course of the conversation. Sound recordings during the exhibition tour were deliberately omitted, as the research focus of the study was on recording eye movements and analyzing complex perceptual processes. [22]

During the tour of the museum, the participants were asked to follow their usual routines. It was important that they were all familiar with the practices in question, i.e., that they had visited museums frequently and taken digital pictures for their social media accounts. Of course, visiting a museum with ET glasses was not an everyday situation. We assumed that the participants in the ET study would provide a filtered version of their experiences and adapt their interactions with the museum space to the experimental situation, just as interviewees filter their experiences in their responses. In ethnography, such adaptation is not an analytical problem. Ethnographers do not gain knowledge by observing what is supposedly authentic, but by engaging in dialogic interaction with individuals. The ET videos and the subsequent interviews offer an opportunity to enter into conversation with actors about viewing practices, not to identify their genuine perceptions in a neutral way. What was decisive, therefore, was not the ET videos, but the EET interviews that followed them. The ET videos were then shown to the study participants in the museum. While the recordings were being played, Sarah ULLRICH conducted ethnographic interviews with the participants, who were able to reflect on their eye movements and viewing habits while watching them.⁶⁾ [23]

The EET interviews in the museum represent the core of this article and significantly shape the methodological discussion. However, a second sub-study of the project focused on the curatorial practices of users of digital image platforms and museum databases. The practices consisted essentially of viewing, and interacting with, digital user interfaces. For the EET interviews, participants were invited to the iLab for stationary ET videos. Below we refer to the sub-studies as "museum study" and "laboratory study" in order to distinguish the research settings. [24]

Due to Covid-19, it was nearly impossible to bring the study participants (who lived all over the world) to a laboratory in Berlin. It was necessary, therefore, to work with actors whose practices and everyday lives the ethnographers were unfamiliar with. This problem reduced the knowledge gained from some of the EET interviews in this sub-study. At the same time, this limitation indicates that the combination of contextual ethnographic knowledge with ET methods is productive. Without contextual ethnographic knowledge, the added value of the interviews on ET videos remains limited. [25]

In contrast to the ET videos in our museum study, the stationary ET videos in the laboratory study were much more precise. Unlike mobile ET systems, stationary data acquisition programs do not face divergent environmental conditions or abrupt movements. Stationary systems are usually operated in controlled environments, which makes it easier to minimize interference from variable lighting conditions, reflections, vibrations, or other external factors during the calibration process. Accordingly, the recorded ET data proved much more accurate. With the help of the eye trackers, the finest nuances of fixations and saccades could be determined and rapid search movements of the eyes could be traced. For the ET videos, the participants developed specific curatorial objectives with the ethnographer which corresponded to their own everyday curatorial practices. Arne and Noah⁷⁾, for example, used image platforms in their everyday lives to create illustrations and designs in fantasy card games and role-playing games, and searched for suitable motifs during the ET videos. Sofia, by contrast, set herself the goal of using the image platforms to find suitable inspiration for her work as an illustrator. Mina scoured an image platform in search of material for her dissertation on traditional Korean clothing (hanbok). The objectives of the actors in this sub-study, therefore, varied greatly. [26]

In total, the two sub-studies comprised 20 EET interviews (ten per sub-study), though the museum study also included ten detailed ethnographic pre-interviews (each lasting approximately one hour). The 20 EET interviews each comprised the actual ET videos, which lasted 20-40 minutes apiece (apart from a few shorter recordings in the laboratory), and the EET interviews, which took place directly afterwards, each lasting 20-70 minutes. (The longer EET interviews of 50-70 minutes mostly took place in the museum.) The EET interviews were embedded in two detailed ethnographic studies, which included numerous other interviews, forms of digital and non-digital participant observation, and the evaluation of online sources (BAREITHER, GEIS, ULLRICH et al., 2023; ULLRICH, 2024). [27]

The EET interviews were analyzed following the principles of grounded theory (GLASER & STRAUSS, 2006 [1967]) and computer-assisted ethnographic data analysis (BAREITHER, 2023). The project team used MAXQDA software (RÄDIKER & KUCKARTZ, 2019). In the ET videos, the frequencies and patterns in the eye movements of the study participants were graphically processed. Both the eye movements and the transcripts of the ethnographic interviews were coded in an open process. In the second step, relevant passages from the data sets were linked and cross-coded. Below we do not discuss the evaluation procedures, but the actual EET interviews and their methodological value. The focus is on examples from the museum study, which we supplemented with selected insights from the laboratory study. [28]

The ET videos were initially used as a specific form of visual memory aid or mnemonic device. The participants in the museum study could no longer remember many art objects shortly after the museum visit, even paintings or sculptures that they had viewed at length or photographed. The ET videos gave them the opportunity to comprehend their own modes of perception and interaction and to reflect on multi-sensory impressions that would otherwise have remained unnoticed. [29]

Cora HAMILTON, a photographer from London, has been living and working in Berlin since 2019. She is the co-founder and creative director of one of the first modeling agencies exclusively for queer people in Germany. In a sequence from the EET interview, Cora could recall a strong visual impression only when talking to the interviewer. She thought about a moment in the ET video that showed her looking at a work of art for the second time. The work in question was a portrait of a young woman dressed entirely in black. The woman held a cigarette in her right hand and a single white flower in her left. The objects in the room behind her were only vaguely recognizable. The outline of a building that seemed to be a modern train station hall appeared in muted tones of blue and green. When asked about what she was thinking as she looked at the painting, Cora said: "I looked at this one again. Yeah right, because you could almost feel that room. I imagined it to have that cool, detached atmosphere because of the blue and green tones in the person's skin" (EET interview with Cora HAMILTON, October 12, 2021). [30]

As this brief example shows, the videos of the eye movements made it possible to address processes that participants were not aware of during the museum visits and that would have gone unspoken without a visual cue. In order to support the study participants in verbalizing these processes, the interviewer highlighted specific eye movements. For example, the interviewer examined the ET videos with the participants and checked whether they had focused on certain paintings or details for unusually long periods; whether numerous points of fixation could be observed in rapid succession; whether certain works had been ignored; and whether connections could be established between the eye movements in the ET videos and the contextual ethnographic knowledge based, say, on participants' pre-interviews or social media accounts. Participants were encouraged to contextualize, classify, and categorize the eye movements visible in the videos. [31]
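The cues the interviewer looked for in the ET videos (unusually long attention to one work, many fixations in rapid succession, works that were ignored) can be sketched as a small summary over annotated fixations. This is a hypothetical illustration, not the project's analysis procedure: it assumes that each fixation has already been assigned by hand to an artwork label, and all thresholds and names are invented for the example.

```python
from collections import defaultdict


def gaze_cues(fixations, artworks, long_dwell=5.0, min_count=4, short_mean=0.3):
    """Summarize talking points for an interview from annotated fixations.

    fixations: iterable of (artwork_label, duration_in_seconds) pairs.
    artworks:  all works in the part of the exhibition that was visited.
    The three thresholds are hypothetical and would need to be set
    per study and per recording device.
    """
    dwell = defaultdict(float)  # total viewing time per artwork
    counts = defaultdict(int)   # number of fixations per artwork
    for label, duration in fixations:
        dwell[label] += duration
        counts[label] += 1

    return {
        # works looked at for an unusually long time overall
        "long_dwell": sorted(a for a in artworks if dwell[a] >= long_dwell),
        # works with many short fixations, i.e. a restless, scanning gaze
        "rapid_succession": sorted(
            a for a in artworks
            if counts[a] >= min_count and dwell[a] / counts[a] <= short_mean
        ),
        # works that received no fixation at all
        "ignored": sorted(a for a in artworks if counts[a] == 0),
    }


# Example with invented data
cues = gaze_cues(
    fixations=[("portrait", 3.2), ("portrait", 2.4),
               ("display case", 0.2), ("display case", 0.25),
               ("display case", 0.2), ("display case", 0.15)],
    artworks=["portrait", "display case", "sculpture"],
)
print(cues)
```

Such a summary can only point the interviewer to moments worth discussing; what a long or restless gaze meant to the participant remains a question for the dialogue itself.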

For example, the ET video of Amanda COULSON-DRASNER, who works as a presenter and social media editor at the international broadcasting service Deutsche Welle, showed her looking closely at a glass display case with many smaller sculptures. The eye movements created numerous saccades on the ET video, which made the gaze appear hectic or excited. Amanda could then be seen pulling out her smartphone to take a picture. The composition required for the photo was complicated, as another room could be seen behind the glass display case and the small sculptures were difficult to distinguish from the background. [32]

Amanda: "Oh, it's crazy. I'm thinking like eight million things. I was trying to find a way to take a picture of the glass case so that you could see the artworks behind it. Because I really like the reflections. It's so interesting to watch my eyes while I'm taking a picture. Actually, that's so crazy."

Sarah: "Yeah, because I think you can really see how you were like looking at the pictures behind that glass thing and that object. [...] You're now trying to find the right perspective to capture that?"

Amanda: "Yeah. [...] Because here I feel like those are hard with the light, because there's kind of a bright light on the ceiling, so like it's not, I feel like with the... the atmosphere makes me dizzy [laughs]. I feel like with a camera, you could get like the sculpture without the bright light in the background, but with the phone, it's not really possible. So it kind of makes the picture not look so great in the end" (EET interview with Amanda COULSON-DRASNER, September 14, 2021). [33]

On the audio track of the EET interview, one can hear how Amanda thinks aloud while watching the recording. The examination of her own visual perception shed light on the complexity of the practices of looking. The verbalization of thoughts and the selection processes also underwent reflection in the laboratory study. For example, a participant we call Noah, an avid fan of the role-playing game Dungeons and Dragons⁸⁾, regularly searched the internet for suitable images to create sessions of the game. When searching the Europeana image database (which contains numerous digitized images and objects from European museums and cultural heritage institutions), he looked for visual material for the role-playing game. During the ET video, the ethnographer noticed how Noah looked at two almost identical photos of a castle, repeatedly switching back and forth between the tabs in the browser before closing one. [34]

When asked about these deliberate eye movements in the subsequent EET interview, Noah explained the thought process behind his decision. The images were used not only as an inspiration for a session of the game Dungeons and Dragons but also to visualize certain plot elements within the game's story. The images were thus of great importance. Even small details that did not fit the imagined story could have significantly influenced the gameplay. The ethnographer and Noah rewatched the scene together. Pointing to his eye movements, Noah explained the decisive moment: On the right-hand edge of one of the photos, difficult to see at first glance, was a parking lot with a car. As he told the interviewer:

"That's when the decision was made. And that was then this okay, it's also about immersion ... A car would just be – it doesn't work in my imagination of a high fantasy setting [a specific type of role-playing game scenario in which there is no modern technology]" (EET interview with Noah, November 10, 2021). [35]

Without the ET video, the decision-making process would most likely have gone unnoticed, as it was not commented on at the time of the event. What the examples have in common is that they show how EET interviews can fundamentally help to recall and verbalize viewing practices. Of course, the verbalization in EET interviews, like any verbalization of experiences in qualitative interviews, is not an authentic reproduction of what was experienced. It is an interpretation of visual perception mutually shaped by the interviewer and the participant. However, this interpretation is much closer to the actual visual perception process than interviews which have to manage without such a memory aid, and the participants can use the video to lead them from one moment to the next. [36]

But what does recalling and reflecting on viewing practices make visible? One participant in the museum study was Tayla CAMP, an American living in Germany, whose Instagram handle is taylacampcurates. In her social media posts, Tayla stages scenes in museums and combines them with fashionable style elements. Bringing together artistic and physical forms of expression, her online media profile is characterized by dark color nuances and gloomy motifs. Tayla is always in search of interesting and controversial stories, most of which cannot be found in what is obviously beautiful. In the ET video of Tayla, we saw how she first photographed and then looked at the painting Jenny Seated by Rudolf SCHLICHTER. The interviewer asked about this moment in the EET interview:

Tayla: "I just liked it. [Tayla and Sarah laugh.] Just like it stood out to me, I guess."

Tayla: "Yeah, I mean, she just looks like someone I would probably be friends with. Even though it's a century between us" (EET interview with Tayla CAMP, October 10, 2021). [37]

The painting obviously corresponded with the interviewee's aesthetic preferences. However, Tayla had difficulty explaining why she found the painting appealing. Drawing on her ethnographic knowledge from other interviews and the analysis of social media content, the interviewer pointed out an aesthetic connection between the image, Tayla's eye movements and her social media account. At a methodological level, this offered an interpretation that could explain why the gaze lingered on this particular image. Tayla accepted the suggestion, but she added her own interpretation by describing an imagined emotional connection to the person portrayed in the picture ("someone I would probably be friends with"). [38]

This excerpt already indicates how reflecting on one's own visual perception allows the articulation and description of interpretative processes. Museum objects, art in particular, enable sensory experiences, emotional relationships and aesthetic practices in which their meaning, value and significance form. These instances of how the physical senses interpret, understand, and show empathy with the world were also evident in the ET videos of our study. As the study participants thought about the video sequences together with the interviewer, they were able to verbalize personal meanings and make connections between media experiences. [39]

In another sequence of the ET video, Tayla looked at the painting Lying Nude by Lesser URY. The artwork depicts a nude female figure on velvet red sheets, bathed in warm, diffused light. The dark colors and intense reds, the subtle merging of light and shadow and the intimate depiction of the human body create a mood of gloom and melancholy. In the ET video, Tayla could be seen fixing her eyes several times on the details of the oil painting and trying to capture the entirety of the compositional design from different perspectives and angles. The Lying Nude was one of the few works of art that Tayla photographed during her visit. In the subsequent interview, she verbalized her thoughts and feelings about the painting, providing context for her own eye movements and interpretative practices:

Tayla: "Reclining Nude. She's so sad. And you know, you see a lot of reclining nudes in the history of art. Normally they're looking either indifferent or they're looking indolent or they're looking, like, seductively. But she is just, like, obviously very upset and we don't know why. Hmm. So I was really interested to learn more about her."

Sarah: "Yeah. So is it, like, is it also sometimes when you can, I don't know, relate with something you see in art that it's, like, more interesting?"

Tayla: "Yeah. Well, there are ... I'm just, like, there is ... There's a specific reason that the artist chose to portray her like this. So I took a picture [laughs]. Yeah. I was just like: What's her deal?" (EET interview with Tayla CAMP, October 10, 2021) [40]

At such moments, the EET interview becomes a place in which the sensory interpretation of art can be articulated and discussed. Tayla's commentary illustrates not only her interpretation process but also her practices of looking. [41]

It is worth noting here that the actual time that Tayla spent looking at the painting was relatively short. Of course, what constitutes a "short" or "long" amount of time is relative and depends on expectations of what is normal. At any rate, the ET video showed the following sequence: First, Tayla walked past the painting, glanced at it and read the accompanying text panel indicating the title of the painting. After looking at the painting for about five seconds, Tayla pulled out her smartphone, looked at the painting through the smartphone screen for another three seconds, took the photo, and then looked at the painting with the naked eye again for six to seven seconds before moving on. Considering that this sequence lasted about 15 seconds, Tayla's explanations in the EET interview seem astonishingly detailed. [42]

Similar observations were also made in other EET interviews. The study participant Ulla SCHARFENBERG is a freelance seminar leader and copy-editor. She uses Instagram as a political forum and uses her posts to raise awareness of socio-political issues such as transphobia, right-wing extremism, sexualized violence, and discrimination. On social media, she also talks about her depression, about eating disorders, about body neutrality, and about toxic compliments. The visual components with which Ulla conveys these topics in digital space are equally diverse and multifaceted: Self-designed illustrations, creative representations of feminist literature, selfies, and images of North Sea beaches, street art, dried flowers, and antidepressants. For Ulla, digital images are a means to an end for her online activism; that is to say, it is not the aesthetics of social media but the topics and objectives of her activist work that shape her engagement with art. [43]

This attitude was also reflected in Ulla's ET video. Whenever an object in the exhibition corresponded with Ulla's areas of interest online and was thus attributable to her personally relevant contexts of meaning, a short-term stabilization of the eye movements was visible. As with Tayla, Ulla's gaze did not linger on the paintings for very long. For example, she looked at the painting Jenny Seated (see Section 5.2) for about 15 seconds, during which she glanced at the text panel. However, in the EET interview, she reflected in detail about this moment. The painting shows a person sitting topless who could be read as female but whose breasts are without nipples. Ulla said the following about her eye movements:

"Yes, I was also interested in the figure because it is probably a female figure. But because she has no nipples, I also had an association with trans people at that moment [...]. When the mastectomy is done, the nipples are missing at first and are then put back on in the second operation or something, or even partially tattooed or something, I don't know. And that's what ... that's how I tried to categorize it, because it doesn't fit [into stereotypical images of male and female bodies] and probably isn't even the point. [...] And then I briefly thought about what was depicted [...]. Yes, because the underwear is also somehow ... it was probably like that at the time, but it's not just ordinary pants that the person is wearing. So all in all, I was kind of interested ... in gender and the representation of gender, I would say" (EET interview with Ulla SCHARFENBERG, September 14, 2021). [44]

Interview excerpts like this one show what was negotiated by the study participants in their viewing practices. Just as in Tayla's example, it is astonishing how determined and detailed the sensory-interpretative engagement appears with viewing times that last only seconds. In our laboratory study, we were able to observe similar processes in Mina. As part of her dissertation project on traditional Korean clothing (hanbok), she frequently scoured image databases for digital images of textile objects or photographs of people wearing hanbok. During the ET video, she searched the database of the National Museum of Korea and came across a large selection of objects and photographs. Through her work, she had acquired an incorporated knowledge that enabled her to quickly select individual images from the large number of images in the database. In many places, just a quick glance was enough for her to recognize when one of the countless images was significant for her research. At such moments, she quickly opened another tab in the browser to enlarge the image and take a closer look. She described the process of instantly deciding which ones were worthy of closer inspection as her "natural behavior." Referring to a picture of a white undershirt, she explained in the EET interview: "I immediately knew that I wanted to download it because it clearly shows the fastening." When asked by the ethnographer whether this was of interest for her doctoral project, she expanded on her idea:

"Yes, I was trying to ... This is not so related to my PhD, but I was more thinking about this haptic Hanbok that I am trying to make with the physicist and engineer for an exhibition. And we're on phase two trying to figure out how we can improve the experience of this haptic Hanbok. We realized that the contact points are very important. So, like, the vibration motors, how can the fastening of a clothing can enhance the contact points? And here you can see that it's a skirt and you tie it around your chest and this ribbon is probably going to tighten up your chest and the skirt, so it's really stuck and has many pleats, and this will make it voluminous, so it won't stick around your leg, for example. So, it means if we add vibration motors around the chest area because of the ribbon [...] you will feel more vibration" (EET interview with Mina, October 13, 2021). [45]

Mina quickly judged the object interesting enough to open it in a new tab. In the subsequent viewing, which only lasted a few seconds, she discussed her thoughts and saved the image on the computer. One way to interpret the discrepancy between detailed interpretation and short viewing time is to assume that Tayla, Ulla, Arne, and Mina only arrived at their interpretations in the interviews in an effort to retroactively enhance their engagement with the artwork. [46]

Another, more obvious interpretation is that the combination of ET videos and interviews reveals how, for experienced museum visitors and users of image platforms, a casual-seeming engagement is in reality a very complex process of sensory-interpretative engagement. Through practices of looking, they quickly seem to grasp and process which images and works are aesthetically pleasing, raise interesting questions, or make connections to political debates, scientific discussions, and their own curatorial routines. From this perspective, retrospective interpretations in EET interviews are not a potential source of distortion. Rather, the ET videos have the ability to capture the processes of interpretation and negotiation that often remain invisible in visual perception and surface them for reflection in dialogic EET interviews. [47]

As is already clear from the previous examples, viewing practices follow incorporated knowledge and aesthetic preferences. For our research, it was particularly relevant that interviewees' perceptual routines and art reception patterns were shaped by their incorporated knowledge of the affordances of digital image technologies. This emerged most clearly in the museum study. Through repeated and everyday engagement with digitally produced patterns of aesthetic classification and taste, the participants had internalized ways of perceiving and appraising that corresponded with the affordances of these platforms (see also PASSMANN & SCHUBERT, 2020). In other words, what the participants in the museum study saw and how they evaluated what they saw was determined by the affordances, attention economies, and representational possibilities of social media. [48]

Tayla CAMP described the connections between social media and her aesthetic tastes in conversation with the ethnographer before her ET videos were recorded:

"I can tell which paintings will work for my account pretty quickly now [...]. I collect works that lean towards the dark, a little bit melancholic or macabre or something like that. And lots of still lifes do this. There are lots of them, like the memento mori type paintings, so anything that kind of looks like that. I don't know. I just have this thing in my brain where I know what works for me when I'm walking past an artwork [...]. I don't know if you can quantify that" (pre-interview with Tayla CAMP, September 1, 2021, via Zoom). [49]

Amanda COULSON-DRASNER, who uses various photographic perspectives, design tools, and creative formats for her Instagram posts, also described how image-centric practices changed her view of the material environment:

"It's repetition. You just know what's going to come out well in a frame. Just by doing it, thousands and thousands and thousands of times. It definitely influenced how I look at stuff [...]" (pre-interview with Amanda COULSON-DRASNER, September 2, 2021, via Zoom). [50]

Incorporated knowledge and aesthetic preferences shape practices of looking in museums. The consequences of these connections can be made visible in a targeted manner with the help of EET interviews. In our study, this concerned not only moments in which study participants looked at artworks for extended periods of time, but also moments in which they averted their gaze or only glanced over a piece. For example, Tayla commented on a sequence in her ET video in which her eyes quickly scanned the museum: "Yeah, here I'm just scoping out the gallery to see which ones I like the most and which ones I'm going to actually spend time on" (EET interview with Tayla CAMP, October 10, 2021). The ET videos of the artist Zoe MILLER were also interspersed with fleeting eye movements. Zoe reflected on one particular sequence in her EET interview:

Sarah: "You also seem to judge the paintings relatively quickly. Is this simply because you've seen everything too often or in a similar form?"

Zoe: "Yes, it was a period and a selection of subjects that don't interest me. Landscapes, interiors without people—that wouldn't suit my style and my reputation as an artist in the media" (EET interview with Zoe MILLER, September 1, 2021). [51]

The works of art that Tayla and Zoe did not recognize as a source for everyday forms of aesthetic expression and personal categories of relevance were blocked out by selective routines of perception. They quickly, implicitly, and naturally recognized those artworks that harmonized with their aesthetic preferences and online presence—and ignored the others. [52]

As can be seen here, the EET interviews form a starting point for reflecting on the role of aesthetic preferences and incorporated knowledge. Consider the EET interview with Wiebke FEUERSENGER, who works for Deutsche Welle as a producer, podcast presenter, and social media editor. The ET video showed her looking at a room filled with sculptures, scanning the objects, and leaving the room again without taking a photo. While watching the video, the interviewer and Wiebke first talked about the room and the sculptures. The interviewer then asked about possible connections between her aesthetic preferences and her viewing practices:

Sarah: "Because we said: Okay, keep Instagram in mind and think a little bit about what could work with your account. Have you been looking for anything like that?"

Wiebke: "Now that I've thought about it, I mean, I wouldn't normally think about it. But when I take photos now, I make sure that if it's vertical, and you think about the fact that for Instagram you have a square image and that's why you don't always frame an image because it looks good but because you know that you're taking it only for a certain section [...] Yeah, I don't know, you pre-filter" (EET interview with Wiebke FEUERSENGER, September 25, 2021). [53]

Wiebke's statement that she "wouldn't normally think about it" is relevant here from a methodological perspective. The ET video enables mental and linguistic reflection in which actors become aware of how their incorporated knowledge shapes how they see. As Wiebke put it later in the interview: "You already make the cut you need in your head, so to speak" (EET interview with Wiebke FEUERSENGER, September 25, 2021). [54]

Comparable processes were found in the laboratory study with Arne. Like Noah (Section 5.1), Arne likes to spend his free time designing material for a fantasy game. In his case, it is the card game Magic: The Gathering, in which players can create their own individualized deck of playing cards. Arne's curatorial objective in the lab was to find suitable motifs for a deck in the database of the National Museum of Korea. Arne had already spoken to the ethnographer in advance about their shared interest in East Asian art, with Painting of a Falcon and the Sunrise being a particular favorite. The painting was in the museum database, and Arne enthusiastically saved it on his computer, describing it as a perfect card motif. During the ET video, Arne accidentally came across a similar painting of a bird sitting on branches. To the astonishment of the ethnographer, however, he looked at the image only briefly before focusing his attention elsewhere. [55]

In the EET interview, the researcher addressed this surprising selection and inquired about the reason for ignoring the painting. Arne explained:

"I could have actually used the [bird], but I thought it lacked context and I thought what can it do, there's no real action, there's just too little, somehow. It's hard to describe, I just think that there has to be a setting where I immediately think of something, or that I feel like writing a card text about, so to speak, or ascribing abilities to [the motif]. If I can't find anything that I can attribute to the creature at first glance, then it'll probably be a boring card and I skip it" (EET interview with Arne, November 1, 2021). [56]

This excerpt from the laboratory study clearly shows how, in the blink of an eye, it is possible to negotiate which images are suitable for one's own curatorial objectives and which are not. While from the perspective of the ethnographer, the second painting seemed entirely suitable for the fantasy game, Arne's gaze was diverted from the picture by his incorporated knowledge of the visual peculiarities of the card game and his own aesthetic preferences. [57]

Arne also noticed this dynamic when looking at the ET images. At another point in the interview, he said that his visual attention was focused more on illustrations and less on the images of three-dimensional objects in the database. "As you've just seen, I focus entirely on [illustrations] of human things or landscapes and don't look at the three-dimensional objects at all ... because I've filtered them out" (EET interview with Arne, November 1, 2021). This impression was also confirmed by the heatmaps created after the EET interviews (fig. 8). Heatmaps take data collected on a person's eye movements, pupil reactions, and average fixation duration and convert them into a two-dimensional visualization. Green dots on heatmaps indicate short dwell times; yellow to red dots indicate longer dwell times. The heatmap of the moment discussed in the interview shows that Arne only glanced at many of the objects or skipped them completely. [58]
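The logic of such a heatmap can be illustrated with a minimal sketch. It is not the procedure or software used in the study: it assumes that fixations are available as screen coordinates with a duration, it ignores pupil data, and the names Fixation, dwell_grid, and dwell_colour as well as the thresholds of 300 ms and 1000 ms are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fixation:
    """One fixation: gaze position in pixels and how long the eye rested there."""
    x: float
    y: float
    duration_ms: float


def dwell_grid(fixations: List[Fixation], width: int, height: int,
               cell: int) -> List[List[float]]:
    """Sum fixation durations per grid cell (rows x columns)."""
    cols = -(-width // cell)   # ceiling division
    rows = -(-height // cell)
    grid = [[0.0] * cols for _ in range(rows)]
    for f in fixations:
        # Skip gaze samples that fall outside the recorded screen area.
        if 0 <= f.x < width and 0 <= f.y < height:
            grid[int(f.y // cell)][int(f.x // cell)] += f.duration_ms
    return grid


def dwell_colour(dwell_ms: float, short_ms: float = 300.0,
                 long_ms: float = 1000.0) -> str:
    """Map a dwell time to a colour class (thresholds are assumed values)."""
    if dwell_ms <= 0:
        return "none"      # the area was skipped completely
    if dwell_ms < short_ms:
        return "green"     # short dwell time, a passing glance
    if dwell_ms < long_ms:
        return "yellow"    # intermediate dwell time
    return "red"           # long dwell time


# Invented example data for a 300 x 200 pixel screen region.
fixations = [
    Fixation(50.0, 40.0, 120.0),
    Fixation(60.0, 45.0, 100.0),
    Fixation(250.0, 40.0, 450.0),
    Fixation(255.0, 60.0, 700.0),
    Fixation(130.0, 150.0, 400.0),
]
grid = dwell_grid(fixations, width=300, height=200, cell=100)
colour_map = [[dwell_colour(v) for v in row] for row in grid]
for row in colour_map:
    print(row)
```

In this invented example, the upper left cell collects two brief fixations and is classed as green, the upper right cell accumulates more than a second of dwell time and is classed as red, and three cells receive no fixation at all, which corresponds to objects that were skipped completely.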

These cursory observations were enough for Arne to judge which images seemed unsuitable for his playing card designs. When asked, he explained that photographs of three-dimensional objects were not relevant because their "aesthetics weren't right" and he wanted to "maintain a certain style prevalent in the game's decks" (EET interview with Arne, November 1, 2021). Not only his personal tastes, but also his incorporated knowledge of the design of Magic: The Gathering significantly influenced his viewing behavior. [59]

In summary, the examples from the museum study and the laboratory study demonstrate how the visual examination of one's own eye movements recorded in ET videos could, with the support of interviewers, initiate a process in which study participants reflect in detail on their incorporated knowledge and relate it to the affordances of social media platforms. Ignoring something belongs to the practices of looking just as much as focusing one's attention on it and this reveals much about the participants' perception. [60]

Before summarizing the potential of EET interviews once again, we want to identify some limitations and potential problems in our method. One set of problems was due to technical issues. Some of the participants wore glasses, which interfered with the registration of eye movements. In the museum study, the gaze display in some eye-tracker videos was shifted to the left or right even for participants without glasses, despite careful calibration. In these cases, it was possible only to guess which painting details, spatial arrangements, or text passages our study participants had focused on, as the display of fixations and saccades was shifted. [61]

Another problem was the limited comparability of the participants and their practices. Our participants had different backgrounds: Some were already familiar with the spaces and curatorial strategies of the museum, while for others they were completely new. The degree of familiarity with the museum spaces and works of art shaped the patterns of perception. In addition, some participants spoke only English and were unable to decipher the information in German-language texts, which in turn could have influenced their perception of the works. In the laboratory study, the participants received different tasks, which they approached on the basis of their inherent prerequisites and context knowledge. However, this problem had little bearing on our study because its aim was not a strict comparison between participant viewing practices but a dense ethnographic description of these practices in the context of digital image curation. In this respect, it was more important to see how the viewing practices related to the other curation processes in which the participant was involved. This question could be effectively addressed by the EET interviews. [62]

Overall, our study is an example of the analytical potential of ET methods in ethnographic research. Specifically, EET interviews bring three advantages. First, they are an opportunity for researchers and participants to recall processes of visual perception and to reflect on them in detail. Because the interviews took place immediately after the ET videos were recorded, participants' memories of their own viewing practices were still present, although the interviewers had no preparation time and had to ask spontaneous questions or raise relevant points ad hoc. [63]

Second, the interviews' emphasis on reflection provided a way to make visible the sensory interpretation of the artworks in viewing practices. This aspect was specific to our research where the engagement with museum objects on an epistemic, sensory, and emotional level was part of the participants' established routines. In other fields suitable for EET interviews—say, ethnographic city walks—the interpretative element may be less pronounced (but memories of one's own personal experiences could play a comparable role). The methodological point here is different: EET interviews can demonstrate how intense a visual confrontation with a material object or space can be, even if it lasts only a few seconds. They make it possible to show in sensory, epistemic, and emotional terms what is negotiated by actors in such moments of viewing. The interviews make it possible to gather rich ethnographic data from brief moments of viewing and to appraise the quality of visual perception and its connections to superordinate practices (such as digital image curation). [64]

Third, we have shown the potential of EET interviews to shed light on incorporated knowledge and aesthetic preferences. In a general sense, this is nothing new. Ethnographers have long used the potential of interviews to understand precisely these dimensions of everyday practices. What is new about our method is that the reflection in EET interviews can be directed towards those bodies of knowledge and aesthetic preferences which accompany and guide everyday viewing practices. This makes an important dimension of visual perception intelligible for ethnographers. [65]

But does the inclusion of ET systems in ethnographic research actually enable us to see how we see? This question must be answered in the negative in the sense that as a socio-cultural practice seeing can itself never be grasped in its entirety. The videos displaying lines of sight and the images of fixations and saccades are merely mediatized representations of eye movements. These representations alone cannot reflect everything that happens in moments of looking and in the sensory negotiations of individuals. What these representations can be, however, are productive points of reflection for the dialogic examination of visual perception. Even if they do not allow us to see how we see, they make practices of looking tangible. And it is here that their potential lies for multi-method ethnographic research. [66]

2) They can also be distinguished into concurrent think-aloud (CTA) and retrospective think-aloud (RTA) approaches (LUND, 2016, p.607).

3) The research was funded by the Deutsche Forschungsgemeinschaft (DFG) [German Research Foundation], project number 421299207.

4) The mobile ET system used in the study was the Pupil Invisible ET system from Pupil Labs. Simple eyeglass frames could be adjusted to the head shape of the study participants while ensuring the stability of the miniaturized cameras and integrated sensors. The ET hardware unit for recording eye movements and the environment established a connection to an Android companion device on which the Invisible Companion app had been installed to capture and transmit recordings. The captured data was also stored in the Pupil Cloud for further visualization, processing and analysis.

5) All participants shown in the figures agreed in writing to the publication of the photographs.

6) The EET interviews in the museum overlap in methodology with the approach of Ausstellungsinterviewrundgang (AIR) [interviews during visits of an exhibition] developed by Luise REITSTÄTTER and Martina FINEDER. Here, "the focused interview is combined with the thinking aloud method while walking through the exhibition" (2021, §14) to achieve multi-sensory engagement with the objects. In contrast to AIR, our EET interviews were focused on visual sensory perception and verbal reflection on multisensory experience. The reflection did not take place when viewing the objects, but during the subsequent observation of eye movements in the ET videos. The use of an ethnographic research design is a specific feature of our EET museum interviews. Nevertheless, EET interviews in museum spaces and AIR seem to us to be very compatible.

7) The names of the study participants were pseudonymized by default. However, some participants (especially those whose curatorial activities are publicly visible on social media) asked to be identified by their real names. We complied with these requests below.

8) Dungeons and Dragons is a (non-digital) role-playing game in which several players follow a narrated story together and interact with each other. A game master prepares supporting materials such as pictures for the game sessions.

Bareither, Christoph (2020). Affordanz. In Timo Heimerdinger & Markus Tauschek (Eds.), Kulturtheoretisch argumentieren. Ein Arbeitsbuch (pp.32-55). Münster: Waxmann.

Bareither, Christoph (2023). Computergestützte ethnografische Datenanalyse (CEDA): Potenziale und methodische Affordanzen von QDA-Software in der ethnografischen Forschung. Hamburger Journal für Kulturanthropologie, 16, 47-65.

Bareither, Christoph; Geis, Katharina; Greifeneder, Elke; Hillebrand, Vera & Ullrich, Sarah (2023). Das Auge kuratiert mit: Praktiken des Anschauens als Teil des digitalen Bildkuratierens. In Christoph Bareither, Katharina Geis, Sarah Ullrich, Sharon Macdonald, Elke Greifeneder & Vera Hillebrand (Eds.), Digitales Bildkuratieren (pp.66-83). Hildesheim: Georg Olms Verlag, https://epub.ub.uni-muenchen.de/95774/ [Accessed: April 11, 2024]

Bareither, Christoph; Geis, Katharina; Ullrich, Sarah; Macdonald, Sharon; Greifeneder, Elke & Hillebrand, Vera (Eds.) (2023). Digitales Bildkuratieren. Hildesheim: Georg Olms Verlag, https://epub.ub.uni-muenchen.de/95774/ [Accessed: April 11, 2024]

Bareither, Christoph; Macdonald, Sharon; Greifeneder, Elke; Geis, Katharina; Ullrich, Sarah & Hillebrand, Vera (2021). Curating digital images: ethnographic perspectives on the affordances of digital images in museum and heritage contexts. International Journal for Digital Art History, 8(1), 82-99.

Beer, Bettina & König, Anika (Eds.) (2020). Methoden ethnologischer Feldforschung. Berlin: Reimer.

Bendix, Regina (2006). Was über das Auge hinausgeht: Zur Rolle der Sinne in der ethnographischen Forschung. Schweizerisches Archiv für Volkskunde, 102, 71-84.

Bischoff, Christine; Oehme-Jüngling, Karoline & Leimgruber, Walter (Eds.) (2014). Methoden der Kulturanthropologie. Bern: Haupt UTB.

Bojko, Aga (2013). Eye tracking the user experience: A practical guide to research. Brooklyn, NY: Rosenfeld Media.

Breidenstein, Georg; Hirschauer, Stefan; Kalthoff, Herbert & Nieswand, Boris (2015). Ethnografie: Die Praxis der Feldforschung (2nd rev. ed.). Konstanz: UVK.

Budge, Kylie (2017). Objects in focus: Museum visitors and Instagram. Curator: The Museum Journal, 60(1), 67-85.

Burlamaqui, Leonardo & Dong, Andy (2017). Eye gaze experiment into the recognition of intended affordances. ASME Conference Proceedings, 1-9.

Eghbal-Azar, Kira (2016). Affordances, appropriation and experience in museum exhibitions: Visitors' (eye) movement patterns and the influence of digital guides. Dissertation, ethnology, University of Cologne, Germany, https://kups.ub.uni-koeln.de/7606/ [Accessed: November 20, 2023].

Eghbal-Azar, Kira & Widlok, Thomas (2013). Potentials and limitations of mobile eye tracking in visitor studies. Social Science Computer Review, 31(1), 103-118.

Glaser, Barney G. & Strauss, Anselm L. (2006 [1967]). The discovery of grounded theory: Strategies of qualitative research. New Brunswick: AldineTransaction.

Grasseni, Cristina (Ed.) (2010). Skilled visions: Between apprenticeship and standards. Oxford: Berghahn.

Harper, Douglas (2002). Talking about pictures: A case for photo elicitation. Visual Studies, 17(1), 13-26.

Hess, Sabine; Moser, Johannes & Schwertl, Maria (Eds.) (2013). Europäisch-ethnologisches Forschen: Neue Methoden und Konzepte. Berlin: Reimer.

Holmqvist, Kenneth; Nyström, Marcus; Andersson, Richard; Dewhurst, Richard; Jarodzka, Halszka & Van de Weijer, Joost (2011). Eye tracking: A comprehensive guide to methods and measures. Oxford: Oxford University Press.

Krug, Maximilian (2022). Gleichzeitigkeit in der Interaktion: Strukturelle (In)Kompatibilität bei Multiaktivitäten in Theaterproben. Berlin: De Gruyter.

Krug, Maximilian & Heuser, Svenja (2018). Ethik im Feld: Forschungspraxis in audiovisuellen Studien. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 19(3), Art. 8, https://doi.org/10.17169/fqs-19.3.3103 [Accessed: November 20, 2023].

Lund, Haakon (2016). Eye tracking in library and information science: A literature review. Library Hi Tech, 34(4), 585-614.

Paßmann, Johannes & Schubert, Cornelius (2020). Liking as taste making: Social media practices as generators of aesthetic valuation and distinction. New Media & Society, 23(10), 2947-2963.

Prinz, Sophia (2014). Die Praxis des Sehens: Über das Zusammenspiel von Körpern, Artefakten und visueller Ordnung. Bielefeld: transcript.

Rädiker, Stefan & Kuckartz, Udo (2019). Analyse qualitativer Daten mit MAXQDA. Wiesbaden: Springer VS.

Reitstätter, Luise & Fineder, Martina (2021). Der Ausstellungsinterviewrundgang (AIR) als Methode. Experimentelles Forschen mit Objekten am Beispiel der Wahrnehmung von Commons-Logiken. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 22(1), Art. 6, https://doi.org/10.17169/fqs-22.1.3438 [Accessed: November 20, 2023].

Reitstätter, Luise; Brinkmann, Hanna; Santini, Thiago; Specker, Eva; Dare, Zoya; Bakondi, Flora; Miscená, Anna; Kasneci, Enkelejda; Leder, Helmut & Rosenberg, Raphael (2020). The display makes a difference: A mobile eye tracking study on the perception of art before and after a museum's rearrangement. Journal of Eye Movement Research, 13(2), https://doi.org/10.16910/jemr.13.2.6 [Accessed: November 20, 2023].

Saini, Pierrine & Schärer, Thomas (2014). Erinnerung, Film- und Fotoelicitation. In Christine Bischoff, Karoline Oehme-Jüngling & Walter Leimgruber (Eds.), Methoden der Kulturanthropologie (pp.313-330). Bern: Haupt UTB.

Schwan, Stephan; Gussmann, Melissa; Gerjets, Peter; Drecoll, Axel & Feiber, Albert (2020). Distribution of attention in a gallery segment on the National Socialists' Führer cult: Diving deeper into visitors' cognitive exhibition experiences using mobile eye tracking. Museum Management and Curatorship, 35(1), 71-88.

Schwanhäuser, Anja (2015). Herumhängen: Stadtforschung aus der Subkultur. Zeitschrift für Volkskunde, 111, 76-93.

Spradley, James P. (2016 [1979]). The ethnographic interview. Long Grove, IL: Waveland Press.

Sturken, Marita & Cartwright, Lisa (2018). Practices of looking: An introduction to visual culture. Oxford: Oxford University Press.

Sumartojo, Shanti; Dyer, Adrian; García, Jair & Cruz, Edgar Gómez (2017). Ethnography through the digital eye: What do we see when we look? In Edgar Gómez Cruz, Shanti Sumartojo & Sarah Pink (Eds.), Refiguring techniques in digital visual research (pp.67-80). Cham: Springer.

Sykes, Jonathan; Dobreva, Milena; Birrell, Duncan; McCulloch, Emma; Ruthven, Ian; Ünal, Yurdagül & Feliciati, Pierluigi (2010). A new focus on end users: Eye-tracking analysis for digital libraries. In David Hutchison, Takeo Kanade, Josef Kittler, Jon M. Kleinberg, Friedemann Mattern, John C. Mitchell, Moni Naor, Oscar Nierstrasz, C. Pandu Rangan, Bernhard Steffen, Demetri Terzopoulos, Doug Tygar, Moshe Y. Vardi, Gerhard Weikum, Mounia Lalmas, Joemon Jose, Andreas Rauber, Fabrizio Sebastiani & Ingo Frommholz (Eds.), Research and Advanced Technology for Digital Libraries (pp.510-513). Berlin: Springer.

Ullrich, Sarah (2021). Digitales Kuratieren der Kunst- und Museumserfahrung auf Social-Media-Plattformen. In Ulrich Hägele & Judith Schühle (Eds.), SnAppShots. Smartphones als Kamera (Vol. 14, pp.67-77). Münster: Waxmann.

Ullrich, Sarah (2024). Social-Media und Museum: Wie digitale Bilder und ästhetische Praktiken die Kunsterfahrung verändern. Bielefeld: transcript.

Ullrich, Sarah & Geis, Katharina (2021). Between the extraordinary and the everyday: How Instagram's digital infrastructure affords the (re)contextualization of art-related photographs. Art Style | Art & Culture International Magazine, 7(1), 117-133.

Van Den Haak, Maaike; De Jong, Menno & Schellens, Peter Jan (2003). Retrospective vs. concurrent think-aloud protocols: Testing the usability of an online library catalogue. Behaviour & Information Technology, 22(5), 339-351.

Willkomm, Judith (2014). Mediatisierte Sinne und die Eigensinnigkeit der Medien. In Lydia Maria Arantes & Elisa Rieger (Eds.), Ethnographien der Sinne: Wahrnehmung und Methode in empirisch-kulturwissenschaftlichen Forschungen (pp.39-56). Bielefeld: transcript.

Christoph BAREITHER is professor of cultural anthropology and digital anthropology at the University of Tübingen. In his work, he focuses on the ethnographic study of digital everyday cultures. His aim is to shed light on the transformations of everyday practices and experiences in the course of digitalization (social media, digital image technologies, computer games, machine learning) and thereby to contribute to pressing socio-political debates.

Ludwig Uhland Institute of Historical and Cultural Anthropology
University of Tübingen
Burgsteige 11, 72070 Tübingen, Germany

Sarah ULLRICH is a postdoctoral researcher at the Ludwig Uhland Institute of Historical and Cultural Anthropology at the University of Tübingen. Previously, she was a PhD researcher in the DFG-funded project "Curating Digital Images: Ethnographic Perspectives on the Affordances of Digital Images in Heritage and Museum Contexts" at the Ludwig Uhland Institute and the Institute for European Ethnology at the Humboldt University of Berlin. In her research, she has focused on digital aestheticization at the junction of museum spaces and image-centric social media platforms. In her current project, "Blickwinkel: Eine kooperative Projektinitiative zu kreativer Kunstvermittlung und digitaler Teilhabe," funded by an innovation grant from the University of Tübingen, she transfers the findings of her ethnographic work to museum practice.

Ludwig Uhland Institute of Historical and Cultural Anthropology
University of Tübingen
Burgsteige 11, 72070 Tübingen, Germany

Katharina GEIS is a doctoral student at the Institute for European Ethnology at the Humboldt University of Berlin. She was a PhD researcher in the DFG-funded project "Curating Digital Images: Ethnographic Perspectives on the Affordances of Digital Images in Heritage and Museum Contexts" at the Centre for Anthropological Research on Museums and Heritage. In her research, she focuses on digital museum work and the users of digital collection databases. In her dissertation, she examines how digital images from online museum databases are used in everyday life and curated on social media, and which modes of knowledge are created in the process.

Bareither, Christoph; Ullrich, Sarah & Geis, Katharina (2024). Ethnographic eye-tracking interviews: Analyzing visual perception and practices of looking [66 paragraphs]. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 25(2), Art. 9, https://doi.org/10.17169/fqs-25.2.4165.

Forum Qualitative Sozialforschung / Forum: Qualitative Social Research (FQS)

ISSN 1438-5627

Funded by the KOALA project

Creative Commons Attribution 4.0 International License