Interactions and Screens in Research and Education

Attentional affordances

Attentional affordances in an instrumented seminar

Mabrouka El Hachani

Jean-François Grassin

Joséphine Rémon

Caroline Vincent

Mabrouka El Hachani, Jean-François Grassin, Joséphine Rémon, Caroline Vincent, « Attentional affordances in an instrumented seminar », Interactions and Screens in Research and Education (enhanced edition), Les Ateliers de [sens public], Montreal, 2023, isbn:978-2-924925-25-6,
version:0, 11/15/2023
Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)

Through a corpus study, we approach the seminar as a dual attentional system in its material construction of space and in its relational construction. In this context, we seek to describe the process of the co-construction of attention.

Material and relational construction

In the material construction of space, objects are (re)arranged to attract attention in a body-to-body apparatus, i.e., in a mediation of bodies through artefacts and telepresence devices in a media space (Gaver 1992). The construction of this attentional set-up takes place within an exploratory space in the context of emerging technologies (telepresence robot) with non-stabilised uses. Thus, it is possible to observe an instrumentation in progress through emerging, and not yet incorporated, technological structures. We analyse, by way of illustration, how the pilot of the Beam robot finds her place through her movements, but also through the discourses regarding the position of the subject, and of the object seen as a tool for empowerment.

In its relational construction, the seminar is the locus of roles (technical assistance, orchestration of exchanges, regulation; see “Artefacted intercorporeality, between reification and personification”) assigned in turn by the technological device or by the subjects, with a certain expectation framework: the seminar as a social reality pre-shaped by norms, representations, and social micro rituals. We analyse how the technological apparatus modifies these expectations and habitus in the course of a construction whose moments we highlight, around an action that imposes cooperation and a cooperation that imposes action. The action of the participants is reciprocally provided by each one of them, the cooperative actions of some allowing the others to act and vice versa.

Our research question involves qualifying attention in a hybrid polyartefacted doctoral seminar. How is attention co-constructed and how do we recognise attentional phenomena in a context where their manifestations are dependent on artefaction?

Below, we discuss the concepts that have informed our approach to attention from an ecological perspective: attentional framing, the different modes of joint attention, attentional gestures and signs, the technogenesis of attention and attentional affordances.

Theoretical framework

The aim of this chapter is to observe and understand the “digital impregnation” of our attention, i.e., how the polyartefacted context recharacterises attention in a specific situation of joint and collaborative attention (for an exploration of the collaborative processes at work, see “Research training in a polyartefacted doctoral seminar”). The situation involves the presence of people in a remote location, which makes attentional regimes more complex. Our ecological perspective on attention will be a microeconomy (that is, considered at the level of situated activity) of joint attention (Citton 2014) including collective, artefacted and transindividual attentional regimes.

The theoretical framework of our analysis is at the crossroads of phenomenological analysis (Depraz 2014; Livet 2016) and theories of artefacted interactions (Arminen, Licoppe, and Spagnolli 2016). The situation we are interested in, that of a research seminar, touches on the analysis of professional situations (workplace studies) and training situations. Our approach is based on a phenomenological analysis, particularly of attention and affordances emerging in the situation, rather than on an analysis of the activity in all its dimensions.

The situation we are studying is a work meeting in which artefacts play an important organisational role. Our analysis is situated, but we believe it can be used to gain a more comprehensive understanding of a world in which we are increasingly caught up in “tightly interwoven networks of intertwined attentions” (Citton 2014, 127; we have translated all quotes by French-speaking authors) and in which artefacts are increasingly used in interactions. This use implies a growing variability in attentional capacities, attention being, moreover, “an intimate dimension of our humanity” (Depraz 2014).

Definition of attention

Attention is a relationship to objects of attention (Citton 2016) – here, those used in a doctoral seminar, to which are added, more specifically, the material and artefactual aspects of hybridity (face-to-face/remote) and the reflexive aspects of the scientific process (the seminar is the research object itself).

Attention is the result of material, social and symbolic filtering operations, a process of constituting certain objects as objects of attention. In our perspective, these objects, such as the Beam telepresence device, interest us in the attentional situation itself as objects affording the “togetherness of co-presence” (Giddens 2005, 171) making a common activity possible. In our view, the uses of independent visual artefacts cannot be studied independently from the process of interaction and the working practices that make them relevant and endowed with meaning (Bonu 2007, 32).

In this sense, the perspective we adopt is resolutely ecological, supplementing a conception of attention focused on objects with a detailed attention to environments. To put it differently, the object takes on meaning for actors within a specific environment, and this meaning allows them to pay attention to it. The phenomenological viewpoint makes attention “an experience of openness to the world rather than an internal mental state” (Depraz 2014).

The artefactual situation we are dealing with does not ontologically change attention, but the multiplicity of possible targets for this attention complicates the “affective and attentional tunings” (Citton 2014) that are required for the joint activity expected in a research seminar: listening to speakers (Sessions 2 and 4), presenting research to the group (Sessions 1 and 5), and engaging in scientific discussions together (all sessions) are all activities whose scripts are relatively well known and expected by the participants.

Attentional framing

Our attentional framing is therefore a collaborative situation that involves joint attention. Natalie Depraz (2014) defines joint attention as “a structural case of relation to others via an object that is the tangible fuel of the relationship between two subjects, which builds intersubjectivity” (Depraz 2014, 410).

Joint attention

Joint attention is a situation of presential co-attention “characterised by the fact that several people, aware of the presence of others, interact in real time according to what they perceive about the attention of the other participants” (Citton 2014, 127), where presence is above all temporal and sensitive (more than strictly spatial and physical). Thus, as far as joint attention is concerned, we are interested in “how a given individual is affected by his or her perception of the attentional behaviours of other individuals whose sensory presence they share” (Citton 2016, 162).

Joint attention or co-attention as defined by Yves Citton is underpinned by three principles which we examine in the rest of this chapter in order to determine the extent to which they are modified by telepresence artefacts:

  1. First of all, co-attention responds to the principle of reciprocity, i.e., a two-way flow of attention. The reciprocity of perspectives (Goffman 2013; Kendon 1990) involves a relationship of mutual observability which is necessary for this type of focused interaction. This requires subjects to be mutually aware of each other’s attention to the same object.

  2. The principle of “affective tuning” is one of the felicity conditions of any interaction and is composed of empathetic microgestures.

  3. Finally, the third principle noted by Citton, which ensues from the second, is the principle of improvisation, given that affective tuning cannot be routinised and depends on the feelings of each actor.

Co-attention or joint attention requires constant attentional feedback: to receive attention, one must pay attention.

The different modes of joint attention

Depraz identifies three modes of experiencing the articulation between attention and intersubjectivity, or three forms of co-attentionality that we explore here and illustrate with examples: intersubjective attention, attentional intersubjectivity and interattention.

Intersubjective attention

The first mode of co-attention is intersubjective attention, which is the broadest sense of joint attention. The focus is on the object as the basis for structuring the relationship between subjects; this object can be the situation itself, discourse and its material support. In this form, subjects have only a minimal awareness of their co-presence. Joint attention is considered minimal when the relationship between the subjects is established through the object, and interpersonal relationships remain in the background.

In the case of the seminar, for example, intersubjective attention involved collectively paying attention to the speakers and the act of looking at the videoprojected slides.

Attentional intersubjectivity

The second mode of co-attention is what Depraz (2014) calls “attentional intersubjectivity”. Attentional intersubjectivity implies that the subjects’ target is the quality of the experience; the relationship between subjects takes precedence.

The quality of the attentional experience is shared, as a common direction of interest (Depraz 2014, 410).

The object disappears in favour of a community of presence fed by the relationship to one’s self.

The attentional regime itself has become the object of focus, but it does not have the status of an external object (Depraz 2014, 410).

In the case of the seminar, this mode involved the quality of presence (including listening comfort) of the people online.


Interattention

The concept of interattention articulates the concept of joint attention and that of mutual awareness, a “singular mode of attention to someone at the moment when they are paying attention to the object which I am also paying attention to at the same moment” (Depraz 2014, 402).

According to Depraz, “joint attention is not primarily joint attention to a targeted object, but shared attention whose conjunctive structure is due to relational dynamics” (Depraz 2014, 399).

The concept of interattention integrates the dimension of the quality of co-presence, underlining the correlation of the attentional movements of the two subjects and their reciprocity. In the seminar we are studying, for instance, we could pay attention to the position of the Kubi telepresence device while the Kubi pilot herself was trying to position it, and simultaneously communicate about this attempt. We will illustrate this phenomenon in the following analysis.

Joint attention as an experience is not homogeneous: it can be emotional, rational, or complex. It is a “mode of presence to” another person, a situation, an event, etc. Attentional practices depend on different confrontations with other elements involved, each creating fragility: the environment, the situation, the activity, the people, the artefacts and the documents.

Diagram of co-attention modes

The following diagram attempts to represent the modes of co-attention that we believe are relevant to the seminar.

Figure 1: co-attention patterns in the seminar

The object of joint attention is represented at the centre of the attentional set-up but does not overlap with a hypothetical centre of the technical set-up itself; interattention is represented by the arrows between the individuals; attention to the conditions of possibility of attention is represented by the arrows directed towards the outer circle, i.e., the hybrid set-up as a whole. The diagram shows the object of the seminar (e.g., a talk) as the object of joint attention, but sometimes the conditions making the interaction possible (e.g., adjusting the sound) become the object of joint attention.

Within the seminar, the modes of co-attention took place within a specific organisational space which we analyse below.

The seminar as a space for attentional organisation: attentional gestures and signs

We understand the seminar as a form of organisational framing of attention in which:

individually and collectively, individuals engage in a dynamic process of orienting attention, constructing meaning and developing appropriate responses (Rouby and Thomas 2014, 43).

Within this system of distributed attentional processing (Ocasio 2011, 1290), different types of behaviour (signal selection, interpretation, action) can occur.

Analysing the attentional frameworks we are interested in requires heuristic tools. Attention is indeed not a unitary concept but a variety of interrelated processes that we describe here. William Ocasio (2011) differentiates three forms of processes: attentional perspective, attentional engagement, and attentional selection.

Attentional perspective, individual or collective

Attentional perspective is shaped by experience and by the attentional roles assigned in the situation.

Attentional perspective

This orientation of attention implies a high level of awareness and is structurally linked to the organisation and goals of the activity.

In an interview, one of the participants explained the attentional perspective she chose (“I tend to fight to the end for the remote participants, even if it means not listening to any of the conference”), while being aware that other perspectives existed (“In our seminar, everyone has a role – we ‘wear a hat’ whether we like it or not – and Christine’s goal is to keep the seminar moving forward”).

But these perspectives are not exclusive, and their plurality was acknowledged by all the participants. For instance, Morgane explains in her interview that a negotiation takes place in a situated manner but also over time: initially, when Adobe Connect wasn’t working, Christine would say, “Well, Morgane, we’re moving on”, and Morgane wondered whether she should “give priority to technical problems” even if it meant “impacting the progress of the seminar”, or whether she should “do without” the remote participants. As the weeks went by, “the quality of the participants’ presence became a priority”, to the point of “not starting the seminar until the problem with Adobe was solved”.

Extract from Morgane’s interview

At our last seminar, some participants were attending via Adobe which wasn’t working, so Jacques Rodet couldn’t begin his talk. At this point, Christine said, “Okay, Morgane, we’re moving on”. In this situation, we wondered if we should give priority to fixing technical problems, except that technical problems can have an impact on the progress of the seminar. When issues occurred at the beginning of the experiment, we thought, “Never mind, we’ll have to continue without the remote participants”, whereas in this case, as time went on, we increasingly took into account the quality of the presence of the participants online, and it became a priority. The last seminar did not start until we managed to sort out an issue with Adobe. At the time, I really appreciated Christine’s saying, “Okay, we’ll wait a little while to solve the problem and then start the conference”.

Attentional perspective determines, among other things, whether or not the attentional markers posed by the participants are taken into account. Thus, if the participants chose to orient their attentional perspective towards technical management, then alerts of this type would be noticed and dealt with as a priority.

Attentional engagement

Attentional engagement is an intentional and sustained process of allocating attention to solve a problem and make sense of a situation.

Executive attention, attentional and inattentional blindness and attentional flexibility

This process involves attentional vigilance, which for Ocasio (2011) is the process by which individuals maintain their focus on a particular set of stimuli. The notion of vigilance, a basic form of attention, is a mode of unfocused attention: “vigilance is thus a mode of awareness, not to the object, but to the world itself” (Depraz 2016, 73).

Pierre Livet (2016) proposes a link between vigilance and negligence, which he refers to in the plural as attentional modes. Negligence is attentional because it stems not from inattention but from attention:

One is alert to what is relevant to the task at hand, and negligent of what is not relevant to the task at hand. […] Attention is also sensitive to an “attentional set”, the set of relevant features to which the participant is willing to respond (Livet 2016, 85).

From this perspective, attentional blindness is the result of stimuli being too weak to attract attention (Kanai, Walsh, and Tseng 2010). Thus, in some cases, certain signs of presence are barely noticed, e.g., the chat module in the video-conferencing interface was sometimes invisible to participants in the seminar room. By contrast, inattentional blindness implies that some stimuli are very noticeable and likely to attract attention, drawing attention away from other stimuli. In some other cases, participants deliberately neglected a weak sign of presence, such as a frozen image in Adobe, so that the main action could continue.

Furthermore, individuals engage differently in action, seeking stability through action routines, or on the contrary in an attentional flexibility that “refers to decentralised initiative-taking and locally invented practices in monitoring and processing indicators” (Rouby and Thomas 2014, 6), an executive attention that:

guides cognition and action when there is no predetermined pattern for achieving goals or no task requirements, or when there is a conflict between goals, such as in novel situations or non-routine activities (Ocasio 2011, 1287).

In the case of the seminar, attentional engagement manifested itself in the way participants constantly renewed their presence to one another by proxy, through the available digital devices (guiding the Beam device, rotating the Kubi, activating a microphone, projecting onto the wall, etc.). Attentional engagement took multiple forms, in a type of attentional flexibility, as defined by Évelyne Rouby and Catherine Thomas (2014).

Selective attention

The term selective attention refers to the process by which individuals direct information processing to a specific set of sensory stimuli at a given time.

Attentional perspective and engagement

Selective attention is determined by the attentional perspective and is the result of attentional engagement. This process is either collectively negotiated or the site of role instantiation: some participants devolved their attention to certain elements of the situation for the whole group.

The three processes described by Ocasio (2011) are among the heuristic tools that allow us to understand the attentional orchestration within the seminar and the technogenesis of attention. We explore below how the notion of affordances also contributes to this understanding.

Technogenesis of attention and attentional affordances

In the situation under study, artefacts are in the foreground and their role is crucial in terms of attentional frames.

Technogenesis of attention

The objects that humans make are never purely functional or utilitarian, but always “attention-fixing props” (Depraz 2014, 8). They articulate the link between the interaction and other situations and places; they localise the interaction by making it the focus of attention. Objects, in interaction, are both “articulators” and “locators”.

Our technical devices play a role in structuring our brains’ attentional operations, but we also select, through the interface of attention, signs from the artefacts, each according to their functionality. Attention is engaged in a recursive loop with the technical environment. Thus, “the technogenesis of attention relies as much on the conditioning of my attention by the technical devices that involve me, as on my capacity to reframe this information” (Citton 2014, 273). When we are in action in a setting, affordances are created by our activity and the surrounding world. These surroundings indicate relevance, offering affordances because of who we are and what we do and perceive from those affordances (Van Lier 2002, 150).


Theory of affordance

The theory of affordance makes it possible to study perception and action together: perception is both an invitation to action and an essential component of action. This forms a triad between the environment, the user and the activity (Van Lier 2002). In affordance theory, the active interpretation of users is central to the emergence of affordances, and the locus of cognition, seen as distributed, is neither representation nor action planning.

Donald A. Norman (2008) distinguishes the notion of affordance in relation to objects, versus what he calls a “social signifier”: “a signifier created or interpreted by individuals or society, signifying an appropriate social activity or behaviour”, i.e., a signal in the social world that can be symbolically interpreted and adapted for social uses. In our view, it is relevant to separate these two elements.

In our case, we cannot interpret the affordances of the connected objects that enable communication individually, because we are engaged in collaborative work. The situation shapes scripts where action is collective and cognition is distributed. These scripts are largely emergent.

Furthermore, following Bruno Latour and Nicolas Guilhot (2007), we believe that “objects have the strange capacity to be both compatible with social competences at certain decisive moments, and the next moment, totally alien to the repertoire of human action” (Latour and Guilhot 2007, 284), and that the situation therefore involves a high level of uncertainty.

This has two main consequences:

These two characteristics of the affordances of telepresence artefacts and software are our focus in this chapter, specifically, their co-construction and their relationship with attention.

Corpus analysis: co-construction of the attentional set-up

Here we describe the complexity of the seminar process through the analysis of the sessions and interviews with the participants.

A complex system

The complexity of the set-up is based on several aspects that we will analyse and explain below: the multiplicity of attentional foci, the complexity of the participation framework, a deficit of perceptibility and the fact that the perspectives of each person are difficult to interchange. We will see that the bidirectional circulation of attention (reciprocal attention to others), attentional intersubjectivity, is the most challenging characteristic of joint attention in the polyartefacted set-up. The non-reciprocity of perspectives makes interactions more complex.

Multiple foci of attention

The complexity of attentional orchestration within the seminar is due, first of all, to the multiplicity of attentional foci. The participants needed to pay attention to the conditions that made the interaction possible (the technical set-up), but also to the main object of the seminar, a talk for example, and to the interactional felicity (Cosnier 2008) of each participant in the group.

“You’re frozen Tatiana”

During Session 3, Christelle was in charge of piloting the Kubi. When Dorothée, in Lyon, discusses the fact that Tatiana’s image in Adobe is frozen as an example for her demonstration, Christelle relays this in the chat window. Thus, Christelle is listening to the seminar (primary attention) but also following the focus of secondary attention, i.e., the conditions that make joint artefactual attention possible, by pointing out to Tatiana that her image is frozen.

Information relay via chat

Dorothée: “Earlier I smiled at Tatiana because I was convinced that she was smiling at me, but no, she’s been completely frozen for at least fifteen minutes.”
Christelle smiles and writes in the chat: “You’re frozen, Tatiana. Reconnect your camera”.

“Yes, go ahead and record”

In Session 5, we see the coexistence and orchestration of attentional foci at different levels: Caroline is talking in Lyon. Christelle, at home, stands up to turn on her camera. Julien goes behind the Kubi. Amélie asks if she needs to record. Christine replies, “Yes, go ahead and record” (without this becoming the primary focus of conversation). Caroline continues without pausing the conceptual discussion. Christelle opens the Kubi interface on her computer.

Focus of attention

In this example, the participants’ primary joint attention is devoted to the seminar and their secondary attention to the conditions of possibility of artefactual joint attention. A further, meta-level focus of attention is directed towards the collection of research data.

The multiplicity of focus points is combined with an audio-visual complexity which means that the participants do not have a comprehensive understanding of how the system works at any given time.

Audio-visual complexity

In Session 2, we can see an example of how complex the set-up is. The origin of the sound is difficult to identify, even for the participants themselves, who do not know which artefact is sending the sound to the remote participants, making attention allocation and interactional ratification more difficult.

“Where is the sound coming from?”

In this example, the online participants report a sound problem: they cannot hear the speakers clearly. The speakers decide to move closer to the microphone. To do so, the speakers move their table closer to the microphone they believe to be the one sending sound to the remote participants. The following exchange takes place between the face-to-face participants:

Christine: “But there’s a microphone there, isn’t there, in front?”
Dorothée (pointing to the microphone in front of the speakers): “Yes, there it is.”
Christine: “The microphone’s not working.”
Joséphine: “It’s recording, but it’s not transmitting.”
Speaker: “This is the recording microphone.”
Samira (points): “It’s this one.”
Christine: “Oh, right.”

Confusion about microphones

“Which one is the Adobe camera?”

Similarly, in Session 5, another case arises regarding the visualisation of the room in Adobe. Christelle asks the participants in Lyon to adjust the angle of vision in Adobe because she cannot see the room.

Confusion about cameras

Jean-François stands up: “Which one is the Adobe camera?”
Christine: “It’s this one here.”
Jean-François: “Is this it?”
Christine: “Yes.”
Jean-François moves the camera.
Jean-François: “Is this OK? Can you see what is being drawn, er...?”
Christelle: “That’s perfect.”

Jean-François asks for confirmation before changing the positioning of the camera that is transmitting the images for Adobe. The situation is not immediately clear to him.

At any given moment, the participants do not necessarily have a clear idea of how the set-up works, i.e., which artefacts are sending sound and image to the remote participants. This complicates the attentional choreographies (Jones 2004, 28) and requires the reconstruction of a collective and distributed apprehension of affordances, i.e., mutually recognised possibilities of action for each participant, which we call attentional co-affordances.

“Can you move, please?”

Morgane asks Amélie (the Beam pilot) to step back as she is in the field of the camera capturing the research data.

Limited field of view from the Beam

Morgane, speaking to Amélie: “Can you move, please? You’re in … the camera’s field.”
Amélie doesn’t hear or realise that she’s in the camera’s field.
Christine gestures with her arm towards Amélie: “Amélie, can you …, where can she go?”
Morgane: “Either to the right…”
Christine: “Well, no, because there’s a camera there.”
Amélie starts to move the Beam backwards.
Christine: “Okay, very good.”
Amélie moves the Beam forward.
Christine: “No.”
Amélie moves backwards.
Christine: “Keep moving back. There you go. Turn around now. That’s it.”

The examples taken from the corpus show the complexity of a situation where collaborative orchestration is made necessary by the absence of a comprehensive understanding of the system by each individual.

“So I kept reconnecting it”

Jean-François relates in an interview an episode that shows the complexity of the system and the ambiguity about intentions that results from it:

Christelle, who had the admin rights for all the participants [Christelle had started the Adobe Connect session and could give the participants host status], at one point sent me a little chat message saying, “I cut off your microphone because it’s echoing”, and in fact, I kept thinking that it had disconnected itself and so I kept reconnecting it.

This example also illustrates the need to make one’s intentions explicit verbally so as not to risk being counter-productive in the context of a lack of comprehensive understanding, if one wants the attentional set-up to be co-constructed.

Complexity of the participation framework

Another level of complexity is that of the participation framework, “the way in which social actors perform, in a dynamic and visible way, statuses such as speaker or recipient” (Colón de Carvajal 2014, 324; Goffman 1981). In fact, in addition to the multiple participation frameworks including the Adobe chat, the discussion in Lyon, SMS messages or e-mails, face-to-face discussion between participants seated together, etc., we can add a technical level and an affordance level. Thus, if the aim is for as many participants as possible to be ratified (see the chapter “Theoretical and methodological framework for visual reflexive ethology”), the same applies from a technical and affordance point of view. Each participant can be ratified from a conversational point of view but momentarily stopped from an affordance point of view, the possibility of action not being seized upon even though it is technically present. For example, Christine gives the floor to Tatiana, but Tatiana has forgotten to reactivate her microphone, so the possibility of action, which consists of transmitting the sound of her voice remotely, is not available, even though technically and socially the conditions are met.

False affordances

Hidden affordances (Gaver 1991) outline a non-participation framework or a framework of non-ratification, due to technical problems or unperceived possibilities of action. This impediment, which stems from the constitutive asymmetry of the situation, requires co-construction in order to circumvent obstacles.

Marion Luyat and Tony Regia-Corte’s definition

Marion Luyat and Tony Regia-Corte (2009, 28) describe “false affordances” as follows:

A glass window without a reflection can falsely make the passage appear to be clear. On the other hand, glass doorways without glass can hinder our locomotion by making us wrongly believe that they contain glass. Quicksand is another example of a false affordance.

In our case, false affordances are those imagined about the perspective of others. For example, I may think that the Beam robot can easily be moved to follow face-to-face speech, when in fact it cannot. Or I may think that when I speak to participants using Adobe Connect, I should look at the wall where their image is projected. But this projected presence is a false affordance, as I would actually have to look at the camera that is filming the room in Lyon to appear to be speaking directly to the participants.

Figure 2: Susan Herring directing the robot towards the projected image of Christelle

Mutual recognition

In a situation involving a face-to-face interaction,

speakers and listeners are equally embodied actors, involved in a common activity, in which participation is accomplished in a mutually recognisable way by all participants, whether ratified or not (Colón de Carvajal 2014, 324).

The character of the mutual recognition of participation and ratification is missing in our hybrid polyartefacted situation and makes episodes of co-constructed adjustments necessary.

Lack of mutual recognition

To add to the complexity of the situation, in the presence of kinetic-audio-visual impediments, participants do not know whether they are dealing with a deliberate choice on the part of the remote participant (for example, she has momentarily turned off her microphone to speak with someone at home), non-ratification (are the participants on Adobe silent because they have not been given the floor, because they cannot hear what is being said in person, or because they have been forgotten by the participants in Lyon?) or a technical problem (for example, are the remote participants silent because there is a momentary problem with the sound in the webcast, or because they have no contribution to make at that moment?).

Examples of erroneous impressions of sound transmission and reception

Tatiana gives an example in an interview:

I had the impression that I wasn’t being heard, that I wasn’t being listened to, but that’s because I put my microphone on mute and I forgot to actually turn it back on, and I was talking and talking. It was quite funny.

In this episode, Tatiana assumes she has the attention of the participants in Lyon, but the absence of sound is due to an oversight on her part, and the Lyon participants could wrongly assume that she has nothing to say at that moment.

Similarly, Amélie explains how her remote presence using the Beam robot makes speaking more complex:

It seems complex and for me at the beginning I didn’t think it was going to be so complex, and I thought I could speak in turn naturally, but because I can’t use visual cues, it’s hard to say when I can speak, or else I have to wait for someone to actually indicate that it’s my turn.

In an interview, she recalls a moment when she had to take her turn by force and interrupt another participant:

In fact, in the room, as we can easily use visual cues, there are always acts of repair; in my case maybe I should have performed an act of repair for the person I interrupted.

We can see that the lack of reciprocity in the perception of the situation leads the participants to wonder how they should manage interactions with each other (see “New norms of politeness in digital contexts”).

Non-reciprocity of perspectives

The main feature that is challenged in the definition of co-attention that we gave earlier is the reciprocity of perceptions.

Joint attention requires the mutually explicit perceptibility of the affordances at stake in the polyartefacted situation we are analysing, i.e., each participant must be aware of the possibilities offered to the other by the environment and vice versa. This awareness is not simple because the situation is new to the participants who are not used to all the artefacts. The co-constructed intelligibility of affordances is at the heart of the hybrid seminar as well as of our study because it makes joint attention possible. In a situation where it is difficult to put oneself in another’s shoes, especially when one has never used an artefact before (how can one know that on the Kubi, the angle of vision is reduced or needs to be adjusted if the artefact is moved manually?), each participant’s possibilities of action must be made explicit.

The non-reciprocity of perspectives constitutive of the system

The “Digital Presences” experimental set-up was constitutively asymmetrical from the point of view of each person’s perspective. This is precisely why the attentional set-up needed to be constructed or reconstructed. This asymmetry of perspectives can be explained from the point of view of empathy.


Empathy is based on an idealisation of the reciprocity of perspectives. In this regard, Marie-Lise Brunel and Jacques Cosnier (2012, 74), with reference to Alfred Schutz (1973) and Peter Berger and Thomas Luckmann (1966, 44–45), recall the “two basic idealisations that are the very conditions of empathic functioning: the idealisation of the interchangeability of perspectives and the idealisation of the congruence of systems of relevance”.

In our set-up, there was a constitutive asymmetry due to the fact that not all the artefacts had been tested by all the participants. The non-interchangeability of points of view was as much physical as mental, as it was difficult to know what was relevant from an attentional point of view for the Beam pilot without having used the artefact oneself.


David Sirkin et al. (2011, 162) formulate this asymmetry in a video-conferencing context as follows:

Attention is fundamental to the flow of face-to-face conversations. Each participant projects cues of what they are paying attention to, and the other participants interpret these cues to maintain awareness of their attentional focus. […] Video-conferencing systems disrupt the link between attentional projection and attentional awareness. They do this, in part, because they do not faithfully reproduce the spatial characteristics of gaze, body orientation and pointing gestures (2011, 162).

According to these authors, video-conferencing introduces too many “invisible parameters”, such as the viewing angle, the size of the computer used remotely, etc. These invisible parameters contribute to the non-reciprocity of views, which are not immediately interchangeable because each participant has not experienced the other’s point of view. This asymmetry appears frequently in the corpus. In the following example, the question of the angle of view threatens the continuity of the interaction.

During Session 5, Christelle (the Kubi pilot) asks the participants in Lyon to reposition the camera sending the video to the remote participants because she cannot see what is happening in the room. The participants in Lyon do this only because Christelle has asked them to.

Camera on the Adobe screen

Dissonant views

Caroline gets up to go and draw a diagram on the whiteboard.
Christelle: “It would be better if the camera was directed to the whiteboard there, the Adobe [camera], that determines what I see in Adobe, because otherwise I can just see you.”

In the interview conducted with Amélie by Dorothée and Samira, this asymmetry also appears in terms of sound. Amélie recalls a moment when she sneezed, and she did not realise that the sound in Lyon was dramatically amplified because of the adjustment of the microphones at that moment. This detail may seem like a non-event, but it had an impact on the group’s thought process and understanding of each other’s attentional perspectives.

Sneezing very loudly

I didn’t feel like I sneezed very loudly and that it had such a strong impact in the room, I didn’t even realise this in fact […] When I watched the video of the seminar, I realised the impact it had, but I didn’t think I sneezed loudly. I didn’t experience the impact it had in the room, in fact.

Affordance blindness

From the point of view of the attentional orchestration, this affordance blindness complicates the co-construction of the set-up. By affordance blindness, we mean the absence of awareness of a possibility of action: Amélie is not aware that the volume of the sound transmitted is high even though technically this possibility of action is available. Amélie participates in this construction later on by turning off her microphone, when necessary, as she explains in an interview: “Afterwards, this was useful for the following seminars, I turned off my microphone so that when I sneezed no one could hear me”.

This time, a strong signal was not given to her by the participants during the interaction, but by the research data, in this case the research videos she had started to watch.

The view could also be affected by affordance blindness. Amélie comments on the situation when the speakers moved the table closer to the microphones (Session 2). She found herself wedged between their table and the participants’ table in Lyon behind her.

Wedged between tables

I thought, maybe the view I have … maybe they’re not that close, but frankly, you can watch the video, I feel like they’re fifty centimetres away from me.

She herself cannot rely on her perception of distance; she is both the most geographically distant participant and the closest from a proxemic point of view, a proximity that is uncomfortable for her. She is aware, at the time of the interview, which takes place one month after this session, of the non-reciprocity of these aspects.

Attentional orchestration

In view of the complexity described above, we can hypothesise that attentional orchestration is constructed by placing successive attentional markers during the interaction. This allows participants to orient the attentional selection process of the entire group, or a few members.

Placement of attentional markers

We define this marking as the production of a weak or strong attentional signal through discourse or gestures. The marker indicates “mutually explicit” affordance attention (to use Depraz’s terminology) at a given moment. This marker may or may not be taken up again spontaneously, depending on one’s role and contextual and temporal priorities. We see in the two examples below that a marker can be produced but not taken up immediately when the imperatives and constituents of the interaction conflict with the co-affordance to be constructed (for example, taking into account the poor sound quality for remote participants).

Each person has their own priorities

As Christine notes in an interview:

If they send a message through the chat and, for example, someone can solve the problem that’s fine, but I’ll still focus on the speakers because I am there to host them.

Conversely, Morgane, the main person responsible for the technical aspects, takes the opposite view:

There’s a sound problem here and I’m trying to make Dorothée understand it at first, knowing that Susan Herring is talking, and I don’t want to interrupt her. There’s a problem with the sound according to the chat I see in Adobe. In this project, I’m responsible for the technical side of things, so my priority when there’s a conference like this is to make sure that everyone can follow along and see and hear the speaker.

Thus, for Morgane, the comments in the Adobe chat are a strong signal, whereas for Christine they are a weak signal.

In Session 2, a cue is given regarding the fact that Tatiana cannot hear, following a signal taken up by Joséphine from the Adobe chat projected on the wall.

Example of an attentional cue

This cue is not picked up on by Christine, after the initial adjustments are made, as her priority is to ensure that the guest speakers can give their talk, which is also being filmed in order to be posted on the IMPEC website. Moreover, the investment in human and material resources in setting up the system makes any malfunctioning problematic, as it is not a set-up that is easy to reproduce outside of a planned event established well in advance.

Conversely, during Session 5, once the attention is focused on the problem of Christelle’s view of the room, Jean-François then spontaneously checks that she can correctly see what is happening in Lyon without her reminding him.

“Can you see what is being drawn?”

Picking up on an attentional cue

Jean-François: “Is this OK? Can you see what is being drawn, er...?”
Christelle: “That’s perfect.”

In this session, there is no external guest participating in the seminar, which also gives the participants more room to rework attentional markers.

A certain fluidity can even be observed, i.e., the interaction does not stop during a readjustment of the possibilities of action. For example, during this same session, Jean-François gets up to adjust the position of the camera filming the Kubi in the room in Lyon. He does so while contributing to the interaction about the diagram being drawn on the board by Caroline.

Jean-François turns the camera as he speaks

Fluidity of adjustments

Jean-François: “In fact, I wonder if we shouldn’t be making a list of human and non-human agents that…”
(He goes back to his seat and then starts again once he’s seated)
“because I think that there are non-human agents like Adobe, Beam”…

As he speaks, he turns the webcam that is filming the Kubi.

Thus, negative affordances (Gibson 1979) for some represent the possibility for others of placing attentional markers which ultimately contribute to the co-construction of the overall attentional set-up.

An example of the attentional affordances of the set-up: projection of the screen presence of the Adobe interface

Putting people on the screen affords their presence. Projecting their image on a wall was an active element of the seminar. This projection allowed the participants to be present to the others and was a very strong attentional focus, perceptible through gestures (pointing gestures) and verbalisations interpreting what was projected (notably when the participants’ image was frozen). This set-up provided attentional affordances, but also false affordances.

Through its salience, the projection of the Adobe screen both directed attention and projected the presence of the connected participants. But the projections also introduced false affordances, directing the gaze and speech towards images and not people.

During Susan Herring’s lecture (Session 6), Christelle was connected to Adobe and her image was projected on the wall. As Susan Herring, piloting a Beam robot, answered a question asked by Christelle, she wanted to direct her “body” towards her and, to do so, she directed the robot towards the projected image of Christelle on the wall.

Susan faces away from Christelle

Collective management of interattention

In trying to orient the robot’s interface towards Christelle’s interface, Susan Herring actually turned her “back” to Christelle’s artefactual eyes and ears: the microphone and camera, located in front of the wall. She was thus deceived by the visual presence of Christelle’s eyes and ears on the wall, a false affordance of her presence. Realising that Christelle could not hear well (Christelle frowned and moved closer to the screen), she then sought to verify the reciprocity of their perception: “I don’t know if Christelle can hear me?” The group, which had picked up on the misunderstanding a few minutes earlier, then took charge of re-establishing interattention to ensure the fluidity and accessibility of the interactions. The participants told Susan Herring, “She’s there!”, pointing to the camera-microphone facing the wall. She then realised her mistake: “Oh, I’m talking to the screen!”

This example illustrates various phenomena that we have demonstrated in this chapter: the lack of symmetry of perceptions, the collective construction of the affordance network providing the fluidity of interactions, false accessibilities to others, the need to make one’s intentions and perceptions explicit to others, and finally, the assumption of responsibility for interattention by the group. We see in the following section that the assumption of responsibility for interattention involves the construction of attentional co-affordances.

Emergence of attentional co-affordances

A trace of the construction of attentional co-affordances appears in Session 5. Christelle points out that Tatiana has posted in the chat, Morgane reads aloud what Tatiana has written, and Jean-François replies aloud to Tatiana without her having spoken. He is, however, explicitly addressing her.

Attentional co-affordances

Jean-François: “Yeah, Tatiana, I was also thinking of quantum physics because in terms of uh, quantum physics there is also the hypothesis of Schrödinger’s cat, which can be both dead and not dead in the box, and in our case, some people can be both present and not present.”
Christine: “Yes, that’s true, that’s interesting.”
Morgane: “Sh…something’s cat…”
Tatiana: “Schrödinger, Schrödinger’s cat.”
Jean-François: “Thank you, Tatiana”.

This exchange illustrates a redefinition of mutually-explicit attention, which no longer involves gaze awareness (awareness of the direction of the gaze) but attention awareness (awareness of the focus of attention). Jean-François’s co-affordance action is validated, since Tatiana confirms that she has heard his proposal and the exchange that followed. Thus, the trust that Jean-François showed in the affordance co-construction appears to be legitimate.

In this session, the last one filmed for the research corpus, mutually explicit attention is no longer based on the convergence of gazes, but on affordance co-construction. The participants make a constantly renewed and gradually less risky bet that the attentional orchestration can be built on a non-reciprocity of perspectives, affordances and gazes.

The “Digital Presences” set-up brings into play our own and others’ perceptibility. Our analysis has brought to light the collective and co-constructed understanding of potentialities in a polyartefacted situation as well as the fact that this understanding requires an attentional orchestration which is itself co-elaborated. This reciprocal accommodation in an artefacted environment leads to a distinction between awareness of the direction of each other’s gaze or gestures, and awareness of each other’s attentional focus. The hybrid artefacted environment thus leads to the exploration of other ways of enabling the projection and awareness of attention. For instance, we learned to interact in the seminar without mutually-explicit attention as defined by Depraz, all the while remaining confident in the interaction. We can hypothesise that trust in the co-affordance construct takes the place of mutually-explicit attention. This is a kind of meta-vigilance as described by Livet (2016): a vigilance to our lack of vigilance or to our attentional neglect. In our view, the effort involved in such an attentional set-up requires individuals to build a collective capable of managing the variety of our attentional neglects, and to be more sensitive to them, in order to manage collective action and participation.

This effort is a dynamic process that leads to the emergence of attentional co-affordances.

Figure 3: Emergence of co-affordances

Attentional co-affordances enable co-presence and attentional orchestration. The perceptibility of these affordances is apprehended collaboratively through the placement of attentional markers, until mutually-explicit attention is achieved despite the asymmetry of perspectives within the media space.

Arminen, Ilkka, Christian Licoppe, and Anna Spagnolli. 2016. “Respecifying Mediated Interaction.” Research on Language and Social Interaction 49 (4): 290–309.
Berger, Peter, and Thomas Luckmann. 1966. The Social Construction of Reality: A Treatise in the Sociology of Knowledge. New York: Doubleday.
Bonu, Bruno. 2007. “Connexion continue et interaction ouverte en réunion visiophonique.” Réseaux 5 (144): 25–57.
Brunel, Marie-Lise, and Jacques Cosnier. 2012. L’empathie. Un sixième sens. Lyon: Presses universitaires de Lyon.
Citton, Yves. 2014. Pour une écologie de l’attention. Paris: Éditions du Seuil.
———. 2016. “Attention collective et vigilance médiatique.” Intellectica 66 (2): 161–80.
Colón de Carvajal, Isabel. 2014. “Parler à distance par visiophone : modification du cadre participatif lors de l’intégration d’un nouveau locuteur.” In Corps en interaction : participation, spatialité, mobilité, edited by Lorenza Mondada, 323–56. Lyon: ENS Éditions.
Cosnier, Jacques. 2008. “Les gestes du dialogue.” In La communication, état des savoirs, edited by Philippe Cabin and Jean-François Dortier, 119–28. Auxerre: Éditions Sciences Humaines.
Depraz, Natalie. 2014. Attention et vigilance: à la croisée de la phénoménologie et des sciences cognitives. Épiméthée. Paris: Presses universitaires de France.
———. 2016. “Vigilance : une micro-dynamique de l’éveil.” Intellectica. Revue de l’Association pour la Recherche Cognitive 66 (2): 67–79.
Gaver, William W. 1991. “Technology Affordances.” In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems Reaching Through Technology - CHI ’91, 79–84. New Orleans, Louisiana: ACM Press.
———. 1992. “The Affordances of Media Spaces for Collaboration.” Proc. CSCW 1992, ACM Press, 17–24.
Gibson, James J. 1979. The Ecological Approach to Visual Perception. Boston: Houghton Mifflin.
Giddens, Anthony. 2005. La Constitution de la société. Paris: Presses universitaires de France.
Goffman, Erving. 1981. Forms of Talk. University of Pennsylvania Publications in Conduct and Communication. Philadelphia: University of Pennsylvania Press.
———. 2013. Comment se conduire dans les lieux publics: notes sur l’organisation sociale des rassemblements. Translated by Daniel Cefaï. Paris: Économica.
Jones, Rodney. 2004. “The Problem of Context in Computer-Mediated Communication.” In Discourse and Technology: Multimodal Discourse Analysis, edited by Philip LeVine and Ron Scollon, 20–33. Washington, D.C.: Georgetown University Press.
Kanai, Ryota, Vincent Walsh, and Chia-huei Tseng. 2010. “Subjective Discriminability of Invisibility: A Framework for Distinguishing Perceptual and Attentional Failures of Awareness.” Consciousness and Cognition 19 (4): 1045–57.
Kendon, Adam. 1990. Conducting Interaction: Patterns of Behavior in Focused Encounters. Studies in Interactional Sociolinguistics 7. Cambridge ; New York: Cambridge University Press.
Latour, Bruno, and Nicolas Guilhot. 2007. Changer de société, refaire de la sociologie. Paris: Éditions La Découverte.
Livet, Pierre. 2016. “Vigilances et négligences.” Intellectica. Revue de l’Association pour la Recherche Cognitive 66 (2): 81–99.
Luyat, Marion, and Tony Regia-Corte. 2009. “Les affordances : de James Jerome Gibson aux formalisations récentes du concept.” L’Année psychologique 109 (2): 297–332.
Norman, Donald A. 2008. “The Way I See It: Signifiers, Not Affordances.” Interactions 15 (6): 18–19.
Ocasio, William. 2011. “Attention to Attention.” Organization Science 22 (5): 1286–96.
Rouby, Évelyne, and Catherine Thomas. 2014. “La construction de compétences collectives en environnement complexe : une analyse en termes d’attention organisationnelle. Le cas exploratoire de la conduite d’un four de cimenterie.” @GRH 12 (3): 39–74.
Schütz, Alfred. 2014. Eléments de sociologie phénoménologique. Paris: L’Harmattan.
Schütz, Alfred, and Thomas Luckmann. 1973. The Structures of the Life-World. Second. Evanston: Northwestern University Press.
Sirkin, David, Gina Venolia, John Tang, George Robertson, Taemie Kim, Kori Inkpen, Mara Sedlins, Bongshin Lee, and Mike Sinclair. 2011. “Motion and Attention in a Kinetic Videoconferencing Proxy.” In Human-Computer Interaction – INTERACT 2011, edited by Pedro Campos, Nicholas Graham, Joaquim Jorge, Nuno Nunes, Philippe Palanque, and Marco Winckler, 162–80. Lecture Notes in Computer Science. Springer Berlin Heidelberg.
Van Lier, Leo. 2002. “An Ecological-Semiotic Perspective on Language and Linguistics.” In Language Acquisition and Language Socialization: Ecological Perspectives, edited by Claire Kramsch, 140–64. Advances in Applied Linguistics. London ; New York: Continuum.