Qual Quant DOI 10.1007/s11135-015-0273-2
Open narrative questions in PC and smartphones: is the device playing a role?
Melanie Revilla (1) · Carlos Ochoa (2)
© Springer Science+Business Media Dordrecht 2015
Abstract  Most survey questions are closed questions, where respondents have to select an answer from a proposed set of alternatives. However, a lot of surveys also include, at least occasionally, some open questions. Open questions that call for elaborated and developed answers, called "open narrative questions", are used when researchers want to go deeper into what respondents think. This paper compares the answers to open narrative questions when the respondent is participating in a PC survey, in a smartphone not-optimised survey, or in a smartphone-optimised survey. The experiment was carried out in Spain using data collected by the Netquest online access panel. Respondents were assigned randomly to each type of device and survey format, in two successive waves. Because respondents have to type in their answer, we expect differences between devices, linked to the size and the kind of keyboards (i.e. physical versus digital, touchscreen or not). Differences are observed between answers that come from PCs and smartphones for the response time per written character, for the total number of characters, and for the use of abbreviations, but not for non-answers and non-substantive responses. No differences are observed between optimised and non-optimised versions for smartphones, except for the response time per character written.

Keywords  Web surveys · Smartphones · Mobile optimised questionnaires · Open narrative questions · Response time
Correspondence: Melanie Revilla, [email protected]; [email protected]
Carlos Ochoa, [email protected]
(1) RECSM-Universitat Pompeu Fabra, Edifici Mercè Rodoreda 24, Office 24.406, Ramon Trias Fargas 25-27, 08005 Barcelona, Spain
(2) Netquest, Gran Capitán 2-4, 08034 Barcelona, Spain
1 Introduction

Most survey questions are closed questions, where respondents have to select an answer from a proposed set of alternatives. Closed questions have the advantage of being easier to analyse (it is not necessary to code them) and may require a lower effort from respondents, since possible answers are already suggested to them. However, a lot of surveys also include, at least occasionally, some open questions, i.e. questions in which respondents can give the answer they want, without being provided with a list of options. Couper et al. (2011) distinguish open questions that call for short answers from open questions that call for more elaborated and developed answers. The latter are called "open narrative questions". Open questions that call for short answers are used when too many possible answers exist, so that it would be very complicated to propose an exhaustive list of choices (for instance, asking for the name of a favourite actor or a favourite brand of beer), or when the researcher is interested in the spontaneous knowledge or preferences of the respondents: which brands come first to respondents' minds, which products respondents have at the top of their mind, etc. Open narrative questions are used when researchers want to go deeper into what respondents think: for instance, explaining why they are in favour of a law, or of a candidate. Respondents then have the possibility to give their opinion in a more detailed way and to nuance it. Thus, the information can be richer for researchers. Besides, the answering process of open narrative questions may look more like a usual conversation, offering respondents greater freedom to express themselves. Hence, these open narrative questions are important. That is why, in this paper, we focus on this kind of question.

When using open narrative questions in data collection modes where an interviewer is present (face-to-face, telephone), respondents do not get clues about the expected length of the answer, except maybe from the interviewers' attitudes. On the contrary, in online surveys, the size of the textbox where respondents have to type in their answer seems to be used by respondents as an indicator of the expected answer length. Indeed, several previous studies (Christian and Dillman 2004; Israel 2010; Smyth et al. 2009) indicate that a larger textbox produces longer answers, although there are also studies that found the opposite (Zuell et al. 2015). In addition, other elements, in particular a counter indicating how many characters are left, can also increase the average number of written characters in the answers (Emde and Fuchs 2012), but this does not affect the non-response rate or the average number of topics mentioned by respondents (Emde and Fuchs 2013).

While a lot of research has been done to compare answers to open questions in online surveys with answers to open questions in other modes of data collection (e.g. paper-and-pencil, face-to-face, telephone), most of this research was about web surveys completed through personal computers (PCs), either desktops or laptops. However, in recent years, the phenomenon of unintended mobile respondents (Peterson 2012; de Bruijne and Wijnant 2014a, b) has appeared: even if not planned by the researchers, and even if the surveys are designed for PCs and not adapted to mobile browsers, more and more respondents answer web surveys through tablets and smartphones. This phenomenon has recently been observed in many different countries (Wells et al. 2013; Revilla et al., forthcoming). It is important because mobile devices differ from traditional PCs on several levels. There are also big differences between tablets and smartphones. This paper focuses on comparing smartphones and PCs, first, because the expected differences between these two devices are the biggest. Second, Revilla et al. (forthcoming) showed that in the country of these analyses, Spain, smartphones are currently used more than tablets to
answer surveys, and would be preferred in the future by more respondents if all surveys were adapted to mobile devices.

Compared to PCs, smartphones are usually much smaller, they have touchscreens and in most cases virtual keyboards, and people carry them around most of the time and use them in all kinds of places. All these differences can influence survey responses as well as data quality. As noted by Lambert and Miller (2015, p. 167): "While smartphones and tablets offer the convenience of internet access virtually anywhere, the touch screen functioning, truncated viewing area and smaller keyboard layout make them more conducive to certain activities (such as checking email and watching funny cat videos on YouTube) but less conducive to others (such as selecting radio buttons from a large item matrix or typing in extensive and detailed responses to open-ended prompts)".

When completing a questionnaire, larger or smaller differences can be expected across devices depending on the format of the question. Typing answers to open questions is one of the tasks where larger differences are expected, because the small keyboard makes it more difficult to input text. Kaikkonen (2009) argues that mobile web users answer only urgent emails from their phones, as briefly as possible, and prefer whenever possible to use a PC with a traditional keyboard for writing longer emails.

In this paper, we use data from an experiment implemented in Spain at the beginning of 2015 to compare answers to open narrative questions in an online survey provided through PCs and smartphones. First, Sect. 2 presents the main results from previous research, followed by the experimental design, hypotheses and data used to test these hypotheses. Then, Sect. 3 provides the main results. Finally, Sect. 4 concludes.
2 Methodology

2.1 Previous research about open questions in mobile devices versus PCs

One of the first experiments about open-ended questions in smartphones is reported by Peytchev and Hill (2010). They argue that "keyboards in mobile web surveys may have a unique impact on responding to survey questions, compared to other computer-administered modes" (p. 331) and that "because the task of text input is more difficult [through mobile devices], some respondents may not provide text information, and even select response options specifically to avoid typing. This is a source of measurement error and data quality that is rather unique to mobile devices and requires further investigation" (p. 330). They study this by implementing an experiment where participants are all provided with identical non-touchscreen smartphones. Then, respondents are assigned either to a closed question or to a half-open question ("Other, please specify"). They found that in the half-open format, more respondents do not answer or select an unlikely response. They interpret this as a way to avoid typing text. The same experiment was replicated by Wells et al. (2014) and by de Bruijne and Wijnant (2014a, b). However, in both cases, they did not find significant differences between the half-open and closed conditions. This can be related to the quite different conditions of the studies: a more recent context, a higher familiarity of the respondents with smartphones (since most respondents used their own smartphone), and mainly touchscreen smartphones in the two replications. As a consequence, the results of the two replications suggest a different conclusion than the initial study: the reluctance of mobile respondents to type in short open-ended responses is not so high that they select unlikely responses to avoid typing.
Other studies compare open-ended questions in mobile devices and PCs, looking at different elements, mainly: item non-response, non-substantive responses, quality or precision of the answers measured by the number of characters written or by the content of the answers (number of topics/ideas provided), and completion time.

Concerning item non-response, it is usually assumed that differences in text entry methods reduce the level of completion for open-ended questions answered through mobile phones. Empirical support for higher non-response among mobile phone respondents is found by Mavletova (2013) and Lambert and Miller (2015). However, Zahariev et al. (2009) and Toepoel and Lugtig (2014) reported a similar amount of item non-response for open-ended questions in both PCs and mobile devices. With respect to non-substantive responses, Mavletova (2013) finds very low proportions of them (less than 1 %) in both PCs and mobile devices, with no significant differences between devices. Concerning the precision of answers to open-ended questions, several studies (Peterson 2012; Mavletova 2013; Wells et al. 2014; Lambert and Miller 2015) found that mobile respondents provide shorter answers than PC respondents. However, Toepoel and Lugtig (2014) and Buskirk and Andrus (2014) did not find differences. Nevertheless, Buskirk and Andrus (2014) analyse open questions asking for the number of apps owned, the total dollar amount spent on all current apps, and the maximum amount spent on any one app. This is a different kind of open question (not narrative), where only a few numbers need to be typed; therefore, different effects can be expected. With respect to content, Peterson (2012) found similar content in both kinds of devices. Concerning completion time, there is evidence that it takes significantly more time to type an answer on a cell phone than on a PC (Mavletova 2013). Finally, similarly to what was done for PC web surveys, some authors studied the impact of changing the size of the textbox when answering through smartphones. For instance, Wells et al. (2014) found that a larger textbox produces longer answers on both PCs and smartphones.

Overall, even if some research has been done to compare answers to open-ended questions in mobile devices and PCs, more research is still needed. Indeed, the evidence from the literature is clearly mixed, which makes it very difficult to deduce from previous research what is going to happen in other situations. We are interested in testing the effect of the device:
– In Spain, a country for which there was no such previous study
– In the frame of a non-probability based panel: this is different, for example, from the research of de Bruijne and Wijnant (2014a, b)
– For smartphones only: this is different, for instance, from Mavletova (2013), who also considered feature phones
– Focusing on a population of smartphone users: respondents use their own smartphone, contrary, for instance, to the experiment of Peytchev and Hill (2010)
– And at the beginning of 2015: this provides updated findings, which is important because of the very quick changes in the situation regarding mobile devices.
Moreover, Buskirk and Andrus (2012) mention two approaches to offering mobile web surveys: the passive approach, which simply consists in allowing respondents to access the survey using mobile devices, and the active approach, which detects the kind of device and,
in the case of mobile devices, modifies the layout to improve the survey experience. We want to test both optimised and non-optimised formats for smartphone respondents. The experimental design used to test the effect of the device, and of the format in the case of smartphones, is explained in the next subsection.
2.2 Experimental design

Text entry, and thus answers to open-ended narrative questions in online surveys, may depend on the input method and on respondents' familiarity with the device of data collection. In order to study the impact of the device used to answer web surveys on open-ended answers, we use a crossover design: panellists with Internet access through both PC and smartphone are invited to participate twice in the same survey using different devices. The only difference between the two waves lies in the introduction to the questionnaire, which is more developed in wave 1. These kinds of designs have two main advantages: first, the influence of confounding covariates is reduced because each crossover respondent serves as his/her own control; and second, optimal crossover designs are statistically efficient, i.e. they require fewer subjects than non-crossover designs.

In wave 1, respondents are randomly assigned to PC or smartphone groups. In addition, within the smartphone group, respondents are randomly assigned to a non-optimised or an optimised version of the survey. The latter is also sometimes called "responsive web survey design" (see for example de Bruijne and Wijnant 2013). In the smartphone non-optimised (SNO) version, the survey page is a smaller version of a traditional PC webpage, i.e. it does not adapt to the screen size. Zooming in and/or scrolling (both vertically and horizontally) are usually necessary to see all the information. On the contrary, in the smartphone optimised (SO) version, the survey program recognises the device and optimises the survey. The questionnaire is tested on mobile devices of different sizes to ensure the display is satisfactory for all of them. The survey page is adapted to the size of the screen, such that respondents do not need to zoom in or scroll horizontally, although they may still need to scroll vertically. Unnecessary elements are limited and the size of the buttons is increased. All in all, the SO layout is intended to make it much easier to read and answer the survey through mobile devices.

All respondents who finished the first wave were invited to participate in the second wave. The time between the two waves was fixed to 1 week, to limit the possibility of changes in opinion and at the same time avoid memory effects. Respondents were randomly assigned again to one of the three conditions: PC, SNO, or SO. Thus, combining the two waves, we can distinguish nine different groups, which are presented in Table 1. The nine experimental groups can be divided into three control groups, where respondents answer twice through the same device, and three pairs of treatment groups, where respondents answer through PC and SNO, PC and SO, or SO and SNO, the order changing across groups.

In order to be able to compare the answers of the same respondents in different conditions, we tried to maximise the proportion of respondents answering both waves of the survey. This was done, first, by giving the following information about the experimental design in the survey introduction of wave 1:
– Respondents should participate in both waves
– They should use the specific device (PC or smartphone) that we ask for
– The incentive they will get for their participation in the survey will be higher in the second wave.
Table 1  The nine experimental groups

Group     Condition   Wave 1                     N     Wave 2                     N
PC        Control     PC                         200   PC                         188
SNO       Control     Smartphone not optimised   200   Smartphone not optimised   179
SO        Control     Smartphone optimised       200   Smartphone optimised       187
PC-SNO    Treatment   PC                         200   Smartphone not optimised   170
SNO-PC    Treatment   Smartphone not optimised   200   PC                         182
PC-SO     Treatment   PC                         200   Smartphone optimised       165
SO-PC     Treatment   Smartphone optimised       200   PC                         184
SO-SNO    Treatment   Smartphone optimised       200   Smartphone not optimised   179
SNO-SO    Treatment   Smartphone not optimised   200   Smartphone optimised       176

N refers to the number of panellists that finished the survey in a given wave.
Moreover, the introduction also included three questions: the first two ("Do you have access to Internet through a PC?", "Do you have access to Internet through a smartphone?") allow us to select respondents having Internet access through both devices, whereas the third ("Do you commit yourself to answering the two waves of the survey and to doing it using the device we ask for?") is used to filter out of the experiment panellists who do not want to commit themselves. In addition, an automatic check ensures that respondents use the required device: if the registered device does not correspond to the kind of device they should use, respondents are stopped by a warning message until they connect through the right device. Second, reminders were sent on days 3, 5 and 7 after the first survey invitation in wave 2, stressing the importance of the panellists' participation in this second part of the experiment.

By looking at the results for waves 1 and 2 in the control groups, we can check whether there is a change between the waves. If no changes are observed in the control groups, then any significant differences found in the treatment groups should come from the change in device. On the contrary, if there is a change in the control groups, this gives us an idea of the size of the effect of the wave or of the repetition (e.g. people may be less motivated in wave 2 because they already answered the same questionnaire once). In that case, the differences between waves in the treatment groups should be interpreted in relation to this repetition effect. By comparing the results for PC-SNO (similarly SNO-PC) in waves 1 and 2, we can study the effect of the device. By comparing the results for SO-SNO (similarly SNO-SO) in waves 1 and 2, we can study the effect of optimising the smartphone survey. Because order effects may occur, we include, for instance, both a group PC-SO and a group SO-PC (counterbalancing). The scope of the experiment is limited to panellists that have Internet access through both PCs and smartphones. It would be interesting in further research to also include tablets and to study what would happen for panellists that do not have access through several devices.
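To make the group structure concrete, the following minimal sketch reproduces the nine-group crossover allocation. The code and names are ours, purely for illustration; Netquest's actual routine also enforced the age and gender cross quotas and is not described in detail in the paper.

```python
import random

CONDITIONS = ["PC", "SNO", "SO"]

# The nine wave-1 x wave-2 combinations of Table 1: three control groups
# (same condition twice) and six treatment groups (condition changes).
GROUPS = [(w1, w2) for w1 in CONDITIONS for w2 in CONDITIONS]

def assign_groups(panellist_ids, seed=1):
    """Spread eligible panellists evenly over the nine experimental groups."""
    rng = random.Random(seed)
    ids = list(panellist_ids)
    rng.shuffle(ids)
    return {pid: GROUPS[i % len(GROUPS)] for i, pid in enumerate(ids)}

groups = assign_groups(range(1800))  # 200 completes were targeted per group
wave1_device = {pid: g[0] for pid, g in groups.items()}
wave2_device = {pid: g[1] for pid, g in groups.items()}
```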
2.3 Hypotheses

We presume that it requires more effort from the respondents to write down answers to open narrative questions through smartphones, mainly because of the differences in keyboards.
As a consequence, we make the following hypotheses. It takes longer to write the answers to open narrative questions through smartphones than through PCs (H1a). Thus, answering through smartphones discourages more respondents from giving answers to this kind of question, increasing the item non-response rate (H2a), and encourages more respondents to write nonsense or "don't know" answers (H3a). Besides, even when respondents make the effort of providing a proper answer, they tend to provide less detailed answers (H4a) and use more abbreviations (H5a).

In the case of open questions, the optimisation of the survey for smartphones mainly affects the readability of the question and the necessity to zoom in and/or scroll horizontally. Couper and Peterson (2015) found that the need for scrolling is one of the main factors explaining why completion times are higher in surveys completed through smartphones. On the contrary, the answering process is quite similar, since in both cases respondents have to write down their answers in a textbox with a similar device. Therefore, we expect the completion time to be slightly longer for the non-optimised version, due to longer reading time (H1b), but the rest to be similar (H2b–H5b).
2.4 Data collection

The experiment was carried out by Netquest (www.netquest.com), an online fieldwork company present in Portugal, Spain and Latin America, and accredited with the ISO 26362 quality standard. Netquest is a non-probability based panel, which invites its panellists through email, using a list of people who agreed to receive emails at the end of a short satisfaction survey proposed on one of the websites collaborating with Netquest. Panellists are rewarded for each survey completed, depending on the estimated length of the questionnaire.

The experiment was conducted in Spain. The data of the first wave were collected from the 23rd of February to the 2nd of March 2015, and the data of the second wave from the 9th to the 18th of March. In wave 1, the target sample size was 200 respondents finishing the survey in each of the nine groups, corresponding to a total of 1800 complete surveys. Cross quotas for age and gender were used to guarantee a distribution for these variables in the sample similar to the one in the panel. In total, 2720 panellists got to the introduction page with the filter questions. 169 of them were filtered out because they did not have Internet access through both PCs and smartphones, and another 17 because they had not accessed the Internet with a smartphone in the past 30 days. Then, 119 were excluded because they refused to commit themselves to answering both waves using the required devices. In addition, 296 were required to continue the survey from a different device than the one they started with but did not make the switch, even though they had just committed themselves to doing so. Finally, 1843 answered the first survey question after all the filters, and 1800 finished the survey, which was the objective.

In wave 2, out of the 1800 invited panellists, 1610 started and finished the survey, which corresponds to a response rate of 88.9 %. However, two respondents are excluded from the analyses because they have missing values for almost all questions, except one open question where they explain that they were not willing to answer the same questions again. In the end, we have 554 respondents for the PC group, 518 for the SO group, and 536 for the SNO group. The sizes of the nine experimental groups presented in Table 1 are quite similar, varying from a minimum of 165 panellists in the PC-SO group (10.3 % of the 1608) to a maximum of 188 in the PC-PC group (11.7 %).
2.5 Questions studied

The questionnaire consisted of around 100 questions, mainly about sensitive behaviours.1 The complete questionnaire as proposed to the respondents in wave 1 can be found at http://goo.gl/g9gAE4 for PC, http://goo.gl/5jF2vr for SO, and http://goo.gl/4c9d1C for SNO. The only difference from wave 2 is the introduction. Because of the sensitive topics included, respondents were allowed to continue the survey without answering some questions. Since most Netquest surveys do not allow respondents to continue without answering all items, this was announced in the introduction. However, each time respondents skipped four questions, they got a message encouraging them to answer.

Table 2 presents the three open narrative questions available in the questionnaire that are studied in this paper. The first two questions follow up on a closed question on a similar topic, asking respondents to give more details about the answer they selected previously. The third question comes after a grid of 14 items about immigrants. All three questions are very broad. Thus, respondents can provide very different answers, but we expect most respondents to have something to say on these general topics.
2.6 Kind of keyboards and direction of the screen

The main reason why we expected differences between smartphone and PC respondents for open narrative questions is related to the difference in keyboards. In this study, 98.7 % of the smartphone respondents in wave 1 use the virtual keyboard appearing on the screen, whereas only 1.3 % use a physical keyboard. The experience of typing on virtual keyboards is quite different from that of typing on the physical keyboards used by all PC respondents.

The main reason why we expected differences between optimised and non-optimised smartphone versions of the survey is related to the readability of the questions. In SNO surveys, since the questions do not adapt to the screen size, respondents may need to scroll horizontally to read them. To reduce the necessity to scroll horizontally, SNO respondents may use the landscape view instead of the portrait view of their smartphones. This idea is supported by our data. Significant differences are observed between the SO and SNO groups in the proportions of respondents using their smartphones in landscape view: in wave 1, 34.6 % of the SNO group use the smartphone in landscape view, whereas only 9.9 % in the SO group do so (z = 10.27; p = .000); in wave 2, the difference is smaller, but these proportions are still respectively 28.0 and 11.6 % (z = -6.65; p = .000).
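The landscape-versus-portrait comparison above is a standard two-sample test of proportions. A minimal sketch of that computation follows; the counts are back-calculated by us from the reported percentages, assuming roughly 600 smartphone respondents per condition in wave 1, and are not the raw data.

```python
from statsmodels.stats.proportion import proportions_ztest

# Approximate wave-1 landscape-view counts: 34.6% of ~600 SNO respondents
# and 9.9% of ~600 SO respondents (reconstructed, illustrative only).
landscape_counts = [208, 59]
group_sizes = [600, 600]

z, p = proportions_ztest(landscape_counts, group_sizes)
print(f"z = {z:.2f}, p = {p:.3f}")  # close to the reported z = 10.27, p = .000
```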
3 Main results

3.1 Completion time per character (H1a–b)

Our first two hypotheses are that it takes longer (a) to answer online surveys using smartphones than using PCs, and (b) for smartphones, to answer when the survey is not optimised. In the first case, this is because typing takes longer on a smartphone; in the second case, because reading takes longer in the non-optimised version. Since the open questions are asked alone on a webpage, we can use paradata to register the time each respondent spent on each of these questions.

1 A large part of the questions were inspired by the study of Mavletova and Couper (2013), but not the open narrative questions used in this study.
Table 2  Open questions used for our analyses

Name                  Question text
Law (Open 1)*         Please, explain in details on which argument your previous answer is based.
Euthanasia (Open 2)   You have commented that euthanasia is justified [always/in most cases/sometimes/never]. Please, explain for which reasons do you think so.
Immigrant (Open 3)    In general, what is your opinion about immigrants?

* The Law question comes after the following closed question: "Do you think that it is important that everybody respects the law? Yes–No"
However, we do not know how much of this time corresponds to reading and how much to typing. Thus, what we compare is the completion time in seconds per character for each of the three open narrative questions in the PC, SO and SNO groups. This is computed for each respondent by dividing, for a given open question, the time spent on the page of the open question by the number of characters written. In this way, we get an approximation of the speed of answer. People who did not provide any answer to the question (i.e. wrote 0 characters) are excluded from this analysis.

Table 3 presents the median (more robust to outliers than the mean) per group of these completion times per character written, as well as the p values of the two-sample Wilcoxon rank-sum (Mann-Whitney) test of the significance of the differences between PC, SO and SNO within waves. In both waves, significant differences are observed between the PC and both smartphone groups for all three open narrative questions. Therefore, we conclude that the device has an impact on the completion time per character: it takes longer to answer the questions on smartphones (while controlling for the length of the answer). This supports H1a. Besides, significant differences are observed between the SO and SNO groups for all questions in wave 2, and for one or two of the three questions in wave 1, depending on the level of error considered. This suggests that the optimisation has an impact on the completion time per character for smartphone respondents, with longer times for the SNO, even if the difference is not always significant (H1b).

To check our hypotheses further, we also compute, for each of the 1608 panellists answering in both waves, the difference in completion time per character between the two waves, for each of the open narrative questions. In that way, since we compare each person with him/herself, we control for all potential differences (e.g. in personal skills or attitude toward surveys), such that the observed differences can really come only from the treatments or possibly from some time/repetition effect. Table 4 presents the median of the individual differences in completion time per character between the two waves for each of the nine experimental groups.

In the three control groups, since the device is kept the same, differences can only come from the time/repetition effect. However, Table 4 shows that the medians of the differences are all around 0 and all not significant using the Wilcoxon signed-rank test of equality with 0, except for "Immigrant" in the SO-SO group (p value = .046). Thus, there is no visible time/repetition effect: the completion time per character in wave 2 is similar, in median, to the one in wave 1. Therefore, the differences observed in the treatment groups can be interpreted as effects of the treatments, since there is no effect of the wave in this case. When the respondents change from PC to smartphones and vice versa, the medians of the differences are always significantly different from 0. The order does not seem to play a role, since the medians (in absolute values) are very similar when PC comes first or second. This again supports our first hypothesis (H1a) that it takes longer to type survey answers on a smartphone (.14 to .25 s more per character).
Table 3  Completion time in seconds per character for the 3 questions (median)

             Wave 1                                           Wave 2
             PC    SO    SNO   p_pc-so  p_pc-sno  p_so-sno    PC    SO    SNO   p_pc-so  p_pc-sno  p_so-sno
Law          .48   .73   .76   .00      .00       .18         .47   .68   .77   .00      .00       .00
Euthanasia   .44   .65   .70   .00      .00       .09         .42   .63   .71   .00      .00       .00
Immigrant    .46   .68   .76   .00      .00       .01         .47   .70   .82   .00      .00       .00

p_pc-so corresponds to the p value of the two-sample Wilcoxon rank-sum (Mann-Whitney) test, the two samples being the PC and the SO groups in a given wave. Idem for p_pc-sno and p_so-sno, but for respectively the two groups PC and SNO, and SO and SNO.
Table 4  Median of individual changes in completion time per character from wave 1 to wave 2 (in seconds per character)

             Control groups           Treatment groups
             PC-PC  SO-SO  SNO-SNO    PC-SO  SO-PC  PC-SNO  SNO-PC  SO-SNO  SNO-SO
Law          .02    -.00   .00        -.14   .19    -.19    .21     -.05    .04
Euthanasia   .02    .01    -.00       -.14   .20    -.17    .21     -.04    .04
Immigrant    .01    .05    -.01       -.18   .16    -.25    .21     -.05    .03
On the contrary, the medians of the differences between the groups getting SO and SNO are small (.03 to .05 in absolute values) and not always significantly different from 0. The optimisation only improves the reading of an open narrative question, not really the answering process, and reading represents only a small part of the overall answering process for this kind of question; the small differences are probably due to this limited gain in reading time. When focusing on individual changes across waves, the support for H1b thus appears to be even more limited.
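As an illustration of the computations behind Tables 3 and 4, the following sketch (our own, run on made-up values, since the paradata are not public) derives the time per character, compares groups with the Mann-Whitney test, and tests within-person wave differences with the Wilcoxon signed-rank test.

```python
import numpy as np
from scipy.stats import mannwhitneyu, wilcoxon

def seconds_per_character(page_seconds, answer):
    """Completion time per written character; None when nothing was typed."""
    return page_seconds / len(answer) if answer else None

# Illustrative per-respondent values for one question and wave (not real data).
pc  = np.array([0.41, 0.48, 0.52, 0.45, 0.50, 0.47])
sno = np.array([0.70, 0.81, 0.76, 0.68, 0.79, 0.74])

# Between groups: two-sample Wilcoxon rank-sum (Mann-Whitney) test, as in Table 3.
stat, p_between = mannwhitneyu(pc, sno, alternative="two-sided")

# Within persons: wave-1 minus wave-2 differences tested against 0 with the
# Wilcoxon signed-rank test, as in Table 4 (here: PC in wave 1, SNO in wave 2).
diffs = np.array([-0.14, -0.21, -0.19, -0.25, -0.17, -0.20])
stat2, p_within = wilcoxon(diffs)

print(np.median(pc), np.median(sno), p_between, p_within)
```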
3.2 Non answer, nonsense and "don't know" (H2a–b and H3a–b)

If answering open narrative questions requires more effort from the respondents, we can expect that some of them try to reduce the burden by not answering the question at all or by providing non-substantive answers for which they do not need to think, like "don't know" answers (DK) or nonsense answers (i.e. just some numbers or letters without any meaning, or a few words that do not answer the question at all). Table 5 shows the proportions of respondents in both waves corresponding to these different situations.

Overall, Table 5 shows that there are very few significant differences. This means that there is little support for the hypotheses suggesting differences between PCs and smartphones (H2a and H3a). On the contrary, there is high support for hypotheses H2b and H3b: there are no significant differences at all between SO and SNO. When comparing the PC and smartphone groups, there are a few significant differences: for example, for item non-response, there are significantly more non-answers for the open question about immigrants on smartphones (both SO and SNO) than on PCs in wave 1. This might be related to the topic. However, these are only occasional differences. The general findings underline the similarity between PC and smartphone answers for non-response, nonsense and DK answers.
Table 5  Proportions of non answer, nonsense, DK, and meaningful substantive answers to the 3 questions in waves 1 and 2

                            Wave 1                                            Wave 2
                            PC    SO    SNO   p_pc-so p_pc-sno p_so-sno       PC    SO    SNO   p_pc-so p_pc-sno p_so-sno
No answer      Law          5.7   6.5   7.2   .54     .28      .64            6.7   6.0   7.1   .63     .79      .46
               Euthanasia   3.8   4.0   3.5   .87     .78      .66            6.3   6.3   6.3   .98     .99      1.0
               Immigrant    3.8   6.5   6.9   .04     .02      .80            5.6   6.9   8.0   .37     .11      .50
Nonsense       Law          2.7   2.5   2.2   .86     .58      .71            1.6   2.3   3.2   .42     .09      .39
               Euthanasia   1.7   2.8   2.2   .17     .52      .46            2.3   2.9   3.2   .58     .40      .79
               Immigrant    .8    1.5   1.7   .28     .19      .81            1.8   2.5   3.0   .43     .20      .63
Don't know     Law          .5    .7    .2    .70     .32      .18            .5    .2    0     .35     .09      .31
               Euthanasia   .5    .5    .2    1.0     .32      .32            .2    .6    .2    .29     .98      .30
               Immigrant    1.3   1.0   1.8   .59     .48      .22            1.1   1.1   2.2   .91     .13      .17
Substantive    Law          91.2  90.3  90.5  .60     .66      .94            91.2  91.5  89.7  .82     .43      .32
answer         Euthanasia   94.0  92.7  94.1  .35     .93      .30            91.2  90.2  90.3  .59     .63      .95
               Immigrant    94.0  91.0  89.6  .05     .01      .48            91.5  89.4  86.7  .24     .01      .18
Again, we can also study the individual differences by taking, for each respondent completing both waves, the differences between the answers in waves 1 and 2. Table 6 reports, in the upper part, the proportions in each experimental group of respondents that gave an answer (whatever it is) in one wave but not in the other, and in the lower part, the proportions of respondents that gave a valid substantive answer in one wave but not in the other (no answer, nonsense or don't know). A very large majority of respondents (from 84 to 96 %) do not change between waves from response to non-response or from substantive answers to non-substantive answers or non-response. Still, in all experimental groups, there are significant proportions of respondents changing from non-answer or from substantive answers to one of the three other categories. Since the proportions are similar between control and treatment groups in most cases, the changes are more probably linked to the wave effect than to the device or the optimisation of the survey presentation.
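The paper does not describe how answers were coded into the four categories of Tables 5 and 6 (most likely by hand). As a purely hypothetical first pass, an automatic pre-coding could look like the sketch below; the patterns and thresholds are our assumptions, and nonsense detection in particular would still require human review.

```python
import re

# Common Spanish "don't know" formulas (illustrative list, not from the paper).
DK_PATTERN = re.compile(r"^(no\s*s[eé]|ns\s*/?\s*nc|no\s+lo\s+s[eé])\.?$", re.IGNORECASE)

def classify_answer(text: str) -> str:
    """Rough first-pass coding into the four categories of Table 5."""
    stripped = text.strip()
    if not stripped:
        return "no answer"
    if DK_PATTERN.match(stripped):
        return "don't know"
    if len(stripped) < 3 and not stripped.isalpha():
        return "nonsense"  # e.g. stray digits; real coding needs human review
    return "substantive answer"

print(classify_answer(""))                                   # no answer
print(classify_answer("No sé"))                              # don't know
print(classify_answer("porque la ley nos protege a todos"))  # substantive answer
```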
3.3 Precision of answers (H4a–b)

3.3.1 Numbers of characters typed

Besides the facts of providing an answer and of providing a substantive answer, another important aspect for open-ended narrative questions is how detailed the answer is. This is measured by the number of characters that respondents typed for the different questions. Table 7 shows the median number of characters per group, focusing on the panellists that gave an answer (i.e. 0 is excluded), together with the p values of the two-sample Wilcoxon rank-sum (Mann-Whitney) test.
Table 6  Percentages of panellists moving between waves…

             Control groups           Treatment groups
             PC-PC  SO-SO  SNO-SNO    PC-SO  SO-PC  PC-SNO  SNO-PC  SO-SNO  SNO-SO
… from no answer to one of the 3 other categories or vice versa
Law          7.4    8.9    5.9        6.1    4.3    5.9     7.1     7.3     4.6
Euthanasia   7.4    7.3    4.8        4.9    4.9    4.1     4.9     5.0     6.3
Immigrant    4.3    7.8    5.3        4.2    4.9    9.4     5.5     9.5     7.5
… from substantive answer to one of the 3 other categories or vice versa
Law          8.0    11.7   12.3       10.3   7.1    11.8    10.4    10.1    8.6
Euthanasia   8.0    11.7   10.2       8.5    7.1    9.4     8.2     8.4     9.2
Immigrant    6.4    11.7   12.3       7.3    7.1    15.9    9.9     14.5    10.9
Table 7  Number of characters for the 3 questions (median per group)

             Wave 1                                           Wave 2
             PC    SO    SNO   p_pc-so  p_pc-sno  p_so-sno    PC    SO    SNO   p_pc-so  p_pc-sno  p_so-sno
Law          77    64    70    .00      .04       .02         65    56    53    .00      .00       .64
Euthanasia   108   87    87    .00      .00       .89         85    66    64    .00      .00       .87
Immigrant    107   84    80    .00      .00       .91         74    60    54    .00      .00       .20

p_pc-so corresponds to the p value of the two-sample Wilcoxon rank-sum (Mann-Whitney) test, the two samples being the PC and the SO groups in a given wave. Idem for p_pc-sno and p_so-sno, but for respectively the two groups PC and SNO, and SO and SNO.
Table 7 shows that panellists write significantly more characters (in median) when answering through PCs than through smartphones, for all three questions, in both waves, and both for optimised and non-optimised smartphone surveys. On the contrary, the differences between optimised and non-optimised surveys are in general not significant. Thus, hypotheses H4a and H4b are supported.

Looking at individual changes from wave 1 to wave 2, Table 8 gives the median of the differences in the number of characters typed in. In the control groups, we see a clear impact of the wave: respondents write significantly shorter answers in wave 2 for all three questions and control groups. This can be linked to the repetition of the questions, which reduces the willingness to develop a precise answer on the second occasion. A few respondents mentioned in their answers that they had already answered in wave 1 and did not want to repeat the arguments again.

In addition, when answering through PC first and smartphone second (groups PC-SO and PC-SNO), the differences are significantly larger than in the control group PC-PC for all three questions. This suggests that there is not only an effect of the wave (otherwise we would expect differences in the treatment groups similar to those in the control group), but also an effect of the device. The same respondent tends to write shorter answers when answering through a smartphone in wave 2 after having answered through a PC in wave 1. On the contrary, when respondents answer through smartphones in wave 1 and PCs in wave 2 (groups SO-PC and SNO-PC), the differences across waves are small and often not significantly different from 0. Besides, they are smaller than expected based on the differences observed in the control groups. This suggests that there is both an effect of the wave and an effect of the device, which go in opposite directions: the negative effect of the repetition (i.e. shorter answers more likely in wave 2) is compensated by the change from smartphone to PC (i.e. longer answers more likely on PCs), leading to similar lengths in both waves.

Finally, the differences between the optimised and non-optimised versions of the smartphone survey are quite similar to the ones in the control groups and all positive, suggesting that they are mainly due to the effect of time, and not to the optimisation of the presentation of the survey. This was expected, since the experience of typing an answer in a textbox is not really improved in the optimised version (what is improved is the reading of the question). So overall, the results support H4a and H4b.
3.3.2 Can the differences be due to the use of abbreviations (H5a–b)?

So far we have measured the precision of the answers through the number of characters typed in. However, there is the possibility that respondents use more abbreviations while answering through smartphones.
Table 8  Median of differences in numbers of characters from wave 1 to wave 2

             Control groups           Treatment groups
             PC-PC  SO-SO  SNO-SNO    PC-SO  SO-PC  PC-SNO  SNO-PC  SO-SNO  SNO-SO
Law          12     9      14         18     -4     17      4       5       12
Euthanasia   13     20     14         35     -12    38      -4      17      13
Immigrant    20     13     11         49     0      53      0       14      13
Table 9  Use of abbreviations for the three open-ended questions (% of yes)

             Wave 1                                           Wave 2
             PC    SO    SNO   p_pc-so  p_pc-sno  p_so-sno    PC    SO    SNO   p_pc-so  p_pc-sno  p_so-sno
Law          0     5.6   2.7   .00      .00       .02         1.0   3.1   4.1   .02      .00       .40
Euthanasia   .7    5.8   5.7   .00      .00       .97         .8    6.4   5.6   .00      .00       .59
Immigrant    .2    5.5   5.0   .00      .00       .70         .8    4.8   5.7   .00      .00       .50

p values from a two-sample test of proportions (t test).
If so, the number of characters written would be reduced, but the content could still be as precise as in the PC answers. Thus, Table 9 reports the proportions of respondents that use abbreviations in their answers to the open narrative questions, for each group and wave. First, there are significant differences between the PC and the smartphone groups: more abbreviations are observed when smartphones are used. Second, except for the Law question in wave 1, no significant differences are found between the two smartphone conditions. The results thus go in the direction of H5a and H5b. However, respondents answering through PCs hardly use any abbreviations, and in the smartphone groups too, only a few respondents use them. Even if it is very common nowadays to write on social media or chat using abbreviations, most panellists seem to consider that abbreviations are not of normal use when answering a survey. Over all the responses, respondents who abbreviated did so only for the Spanish words "que" (typing "q" or "k") or "porque" (typing "pq" or "xq"), and in very few cases for "de" (typing "d"). Thus, the use of abbreviations cannot explain the differences in the number of characters written observed in the previous subsection (3.3.1); it actually represents a very small gain in number of characters.

Table 10 reports the proportions of respondents who used abbreviations in one wave but not in the other, among those that answered a given open question in both waves. Except in the PC-PC control group, there are small but significant proportions of respondents who abbreviated in one wave but not in the other: the biggest proportion is 9.7 %, which corresponds to 15 respondents. This happens even more in the control groups than in the treatment groups, in particular for "Euthanasia" and "Immigrant". Besides the possible wave effect, the probability to abbreviate might depend on the topic, the formulation and the length of the answers. Indeed, the abbreviations are concentrated on very few words ("que", "porque", "de"). If respondents formulate an answer not including these three words, they somehow miss the opportunity to abbreviate. Thus, we might not detect in some answers that a respondent has a tendency to abbreviate, just because none of the words where s/he would have done so were present in his/her answer. This makes the individual comparisons difficult to interpret. Therefore, we only focus on the main conclusion from Table 9: there are more abbreviations in smartphone answers.
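Because the observed abbreviations are limited to three words, flagging them can be sketched with a simple pattern match. This is our illustration; the authors do not say how they detected abbreviations.

```python
import re

# The only abbreviations the authors report: "que" -> "q"/"k",
# "porque" -> "pq"/"xq", and in very few cases "de" -> "d".
ABBREV = re.compile(r"\b(q|k|pq|xq|d)\b", re.IGNORECASE)

def uses_abbreviations(answer: str) -> bool:
    """Flag answers containing at least one of the reported abbreviations."""
    return ABBREV.search(answer) is not None

print(uses_abbreviations("creo q la eutanasia es un derecho"))  # True
print(uses_abbreviations("porque cada persona debe decidir"))   # False
```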
Table 10  Proportions of respondents using abbreviations in one wave but not the other

                        Control groups           Treatment groups
                        PC-PC  SO-SO  SNO-SNO    PC-SO  SO-PC  PC-SNO  SNO-PC  SO-SNO  SNO-SO
Law (n = 1428)          1.2    8.7    7.1        3.7    6.0    3.3     2.4     9.7     7.1
Euthanasia (n = 1469)   .6     8.7    9.5        6.9    6.0    3.9     3.5     5.6     7.6
Immigrant (n = 1450)    .6     7.0    7.8        4.8    5.4    4.5     3.1     5.8     4.5

Because of non-response, the number of observations differs for the three questions; it is indicated in parentheses (n =).

3.3.3 Background variables, mobile web usage and context variables

Since the differences in the number of characters typed cannot be due to the use of abbreviations, we also consider other variables that could affect the precision of answers:
– Some background variables: gender, age, and education
– Some mobile web usage variables: how long respondents have been using a smartphone to go online (number of months), and how frequently they connect to the Internet through a smartphone (from "1 - Every day" to "6 - Less than once a month")
– Some context variables: direction of the screen (dummy variable = 1 if portrait direction), place of completion (dummy variable = 1 if completed at home), presence of other people around the panellist while s/he was completing the survey (dummy variable = 1 if other persons present), multitasking, i.e. doing other activities while completing the survey (number of activities the respondents report they have been doing, from a list of six activities), satisfaction with the Internet connection speed during the survey completion (dummy variable = 1 if satisfied), and fixed payment for Internet (dummy variable = 1 if the payment does not depend on the number of megabytes used).

Since the same respondents answer in two waves, our observations are nested within individuals. Hence, we first compute the Intraclass Correlation Coefficient (ICC): 50.9 % of the total variance is explained at the individual level for the "Law" question, 50.5 % for the "Euthanasia" question, and 45.0 % for the "Immigrant" question. Thus, it is necessary to consider the second level, i.e. individuals (Occhipinti 2012, pp. 4-5). Therefore, we study the precision of answers by running, for each question, Mixed Models (MM) with observations nested in individuals. We include the following explanatory variables: the wave or occasion effect (i.e. the effect of answering in wave 2 instead of wave 1), answering through PC instead of SNO, using a SO version versus a SNO version, and the background, mobile web usage and context variables mentioned before. Results are presented in Table 11.

First, previous conclusions are confirmed. For all three open questions, there is a significant effect of time: shorter answers are provided in wave 2. In addition, there is a
Table 11  Explaining the precision of answers with different kinds of explanatory variables (MM)

                                               Law               Euthanasia        Immigrant
Variables                                      Coeff.   P>z      Coeff.   P>z      Coeff.   P>z
Time        Wave 2                             -13.4    0.00     -20.0    0.00     -24.8    0.00
Device      PC                                 10.9     0.00     16.1     0.00     22.4     0.00
            SO                                 -2.7     0.20     -1.4     0.56     2.5      0.38
Background  Age                                -.6      0.00     -.7      0.00     -.3      0.09
            Education                          2.6      0.05     2.3      0.11     3.4      0.05
            Men                                1.3      0.31     2.7      0.08     3.3      0.08
Mobile      Months using smartphone            -.1      0.25     .0       0.68     -.1      0.39
            Freq. internet through smartphone  1.6      0.28     -1.7     0.31     -.9      0.64
Context     Portrait                           -2.6     0.35     -4.8     0.13     -6.5     0.08
            At home                            -5.1     0.04     2.2      0.41     -3.1     0.33
            Others present                     -1.7     0.42     1.3      0.58     4.9      0.09
            Multitasking                       -.6      0.68     -1.3     0.44     -.1      0.94
            Satisf. internet speed             1.2      0.76     -.9      0.84     6.4      0.23
            Fixed payment internet             .6       0.85     2.6      0.51     1.8      0.70
            Constant                           101.3    0.00     130.1    0.00     108.0    0.00
positive effect of answering through PC instead of SNO. On the contrary, there is no significant effect of answering through SO instead of SNO. Second, concerning the background variables, age has a significant negative effect at the 5 or 10 % level for all open questions. Education also has a significant effect: more educated people type longer answers. Gender has a significant effect only at the 10 % level, for "Euthanasia" and "Immigrant". Finally, there is no significant effect of the mobile web usage variables or of the context variables (except for answering at home in the "Law" question). In conclusion, the precision is mainly affected by the occasion effect, by answering through PC or smartphone, and by the socio-demographic characteristics of the respondents.
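For readers who want to reproduce this kind of analysis, a minimal sketch with statsmodels follows. The data file and column names are hypothetical, and the authors do not state which software they used; this is an illustration of the technique, not their exact pipeline.

```python
import pandas as pd
import statsmodels.formula.api as smf

# One row per respondent x wave; "n_chars" is the number of characters written.
df = pd.read_csv("open_answers_long.csv")  # hypothetical file

# Null model: the intraclass correlation is the share of the total variance
# located at the respondent level.
null = smf.mixedlm("n_chars ~ 1", df, groups=df["panellist"]).fit()
var_between = float(null.cov_re.iloc[0, 0])
icc = var_between / (var_between + null.scale)
print(f"ICC = {icc:.3f}")  # the paper reports .45 to .51 across the 3 questions

# Mixed model corresponding to Table 11, fitted per open question
# (SNO is the reference category for the device dummies).
model = smf.mixedlm(
    "n_chars ~ wave2 + pc + so + age + education + male"
    " + months_smartphone + freq_mobile_internet + portrait + at_home"
    " + others_present + multitasking + satisfied_speed + fixed_payment",
    df, groups=df["panellist"],
).fit()
print(model.summary())
```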
4 Discussion

Even if most survey questions use a closed format, open narrative questions are regularly used to get deeper information about the respondents' attitudes or opinions. These questions can provide rich information that complements the answers to the closed questions in the survey very well. However, in online surveys, open narrative questions usually require respondents to type in their answer. Thus, the typing skills and conditions can affect the answers.

One element influencing the typing conditions is the device used to complete the survey. More and more respondents participate in online surveys through mobile devices instead of PCs. In the case of smartphones particularly, the typing experience is very different from that of a PC: in most cases, respondents have to type on a virtual keyboard, on a small screen, instead of typing on a physical keyboard of much larger dimensions. In addition, smartphone respondents may more easily answer from any place and be involved in multitasking. However, these two context variables did not show any significant effect on the precision of open narrative questions in the mixed-model analyses.

In order to study these questions, an experiment was carried out with Netquest panellists in Spain at the beginning of 2015. Panellists with Internet access through both PC and smartphone were randomly assigned to answer the survey through PC, SO or SNO, in two successive waves. Table 12 summarises the main results of the experiment. We found that the device has an impact on the speed of answer (longer response time per character written on smartphones), the precision of the answers (lower number of characters written on smartphones) and the use of abbreviations (higher on smartphones, but overall quite low, so it cannot explain the differences in number of characters). On the contrary, item non-response, nonsense and don't know answers did not vary significantly between PC and smartphone answers.

Table 12  Summary of the results

                            PCs versus smartphones      SO versus SNO                 Verified
H1: Speed of answer         a: longer in smartphones    b: slightly longer for SNO    a + b
H2: Item non response       a: higher in smartphones    b: no difference              Only b
H3: DK and nonsense         a: higher in smartphones    b: no difference              Only b
H4: Precision of answers    a: lower in smartphones     b: no difference              a + b
H5: Use of abbreviations    a: higher in smartphones    b: no difference              a + b
Regarding the optimisation, significant differences were found only for the speed of answer, which is most probably linked to the need to zoom in and scroll horizontally in order to read the question in the non-optimised version. The optimisation does not really play a role in the experience of typing, so it is not surprising to find few or no differences at all.

Overall, the results suggest that it may be problematic to rely on the information from open narrative questions if many respondents are completing the survey through smartphones, since these respondents tend to spend more time per character written and provide shorter answers. Thus, there is a loss in precision. Therefore, if online panel companies have a high proportion of smartphone responses, we recommend that they avoid using open narrative questions in their current format. As mentioned previously, the current optimised smartphone format does not really optimise the answering process for open narrative questions. Online panel companies could try to improve the format of such survey questions, for instance by taking advantage of the possibilities of the Internet to record answers to open narrative questions. In particular, more and more applications make use of voice-dictation functions. Offering such a tool to answer open narrative survey questions sounds like a very attractive option.

However, the experiment was implemented in a non-probability based online panel, selecting only panellists with Internet access through both a PC and a smartphone. The experiment was also conducted in a single country, Spain. Therefore, it is not possible to generalise the findings to different web surveys, in other contexts or countries. More experiments should be conducted to test whether the findings can be replicated in different conditions. Moreover, mobile devices are evolving extremely quickly. The size of the screen, and therefore of the keyboard, has increased a lot just in the last couple of years. People are also getting more and more used to typing on their smartphones and to using virtual keyboards and touchscreens. All these changes happen very fast and could change the patterns observed in this study in the next few years. Further research needs to follow these evolutions. In addition, there is a large diversity of smartphones. Variations in the operating systems (e.g., Android or Apple iOS), in the browsers, in the processing power, in the screen size and resolution, in the kind of keyboard (physical or virtual), etc., may affect the survey experience and answers. Therefore, further research could compare answers to open narrative questions for different types of smartphones, not only in comparison with PCs. What happens with tablets would also be worth studying.

Acknowledgments  We are very grateful to the Netquest team for the support in planning and collecting the necessary data. We would also like to thank Daniele Toninelli for his help in designing the experiment and his very useful comments on a previous draft of this paper, and the anonymous reviewers who gave us many valuable suggestions.
References

Buskirk, T.D., Andrus, C.H.: Smart surveys for smartphones: exploring various approaches for conducting online mobile surveys via smartphones. Survey Practice. http://surveypractice.wordpress.com/2012/02/21/smart-surveys-for-smart-phones/ (2012)
Buskirk, T.D., Andrus, C.H.: Making mobile browser surveys smarter: results from a randomized experiment comparing online surveys completed via computer or smartphone. Field Methods 26(4), 322–342 (2014). doi:10.1177/1525822X14526146
Christian, L.M., Dillman, D.A.: The influence of graphical and symbolic language manipulations on responses to self-administered questions. Public Opin. Q. 68(1), 57–80 (2004)
Couper, M.P., Kennedy, C., Conrad, F.G., Tourangeau, R.: Designing input fields for non-narrative open-ended responses in web surveys. J. Off. Stat. 27(1), 65–85 (2011)
Couper, M.P., Peterson, G.: Exploring why mobile web surveys take longer. Presented at the General Online Research conference, 18–20 March 2015, Cologne. https://conftool.gor.de/conftool15/index.php?page=browseSessions&form_session=26&presentations=show (2015)
de Bruijne, M., Wijnant, A.: Can mobile web surveys be taken on computers? A discussion on a multi-device survey design. Surv. Pract. 6(4) (2013)
de Bruijne, M., Wijnant, A.: Mobile response in web panels. Soc. Sci. Comput. Rev. 32(6), 728–742 (2014a)
de Bruijne, M., Wijnant, A.: Improving response rates and questionnaire design for mobile web surveys. Public Opin. Q. 78(4), 951–962 (2014b). doi:10.1093/poq/nfu046
Emde, M., Fuchs, M.: Using adaptive questionnaire design in open-ended questions: a field experiment. In: JSM Proceedings, Statistical Computing Section. American Statistical Association, Alexandria (2012)
Emde, M., Fuchs, M.: Using interactive feedback to enhance response quality in web surveys: the case of open-ended questions. Presentation at the GOR conference 2013 (Mannheim): http://conftool.gor.de/conftool13/index.php?page=browseSessions&presentations=show&form_session=14 (2013)
Israel, G.D.: Effects of answer space size on responses to open-ended questions in mail surveys. J. Off. Stat. 26(2), 271–285 (2010)
Kaikkonen, A.: Mobile internet: past, present, and the future. Int. J. Mob. Hum. Comput. Interact. 1, 29–44 (2009)
Lambert, A.D., Miller, A.L.: Living with smartphones: does completion device affect survey responses? Res. High. Educ. 56, 166–177 (2015). doi:10.1007/s11162-014-9354-7
Mavletova, A.: Data quality in PC and mobile web surveys. Soc. Sci. Comput. Rev. 31(4), 725–743 (2013)
Mavletova, A., Couper, M.P.: Sensitive topics in PC web and mobile web surveys: is there a difference? Surv. Res. Methods 7(3), 191–205. https://ojs.ub.uni-konstanz.de/srm/article/view/5751 (2013)
Occhipinti, S.: Mixed modelling using Stata. Workshop for GSBRC. http://www.griffith.edu.au/__data/assets/pdf_file/0011/439346/Stata_mixed_intro-1.pdf (2012)
Peterson, G.: Unintended mobile respondents. Paper presented at the CASRO technology conference, New York. http://c.ymcdn.com/sites/www.casro.org/resource/collection/D0686718-163A-4AF4-A0BB8F599F573714/Gregg_Peterson_-_Market_Strategies.pdf (2012)
Peytchev, A., Hill, C.A.: Experiments in mobile web survey design: similarities to other modes and unique considerations. Soc. Sci. Comput. Rev. 28(3), 319–335 (2010)
Revilla, M., Toninelli, D., Ochoa, C., Loewe, G.: Do online access panels really need to allow and adapt surveys to mobile devices? Internet Res. (forthcoming)
Smyth, J.D., Dillman, D.A., Christian, L.M., McBride, M.: Open-ended questions in web surveys. Can increasing the size of answer boxes and providing extra verbal instructions improve response quality? Public Opin. Q. 73(2), 325–337 (2009)
Toepoel, V., Lugtig, P.: What happens if you offer a mobile option to your web panel? Evidence from a probability-based panel of Internet users. Soc. Sci. Comput. Rev. 32(4), 1–17 (2014)
Wells, T., Bailey, J.T., Link, M.W.: Filling the void: gaining a better understanding of tablet-based surveys. Surv. Pract. 6 (2013)
Wells, T., Bailey, J.T., Link, M.W.: Comparison of smartphone and online computer survey administration. Soc. Sci. Comput. Rev. 32(2), 238–255 (2014)
Zahariev, M., Ferneyhough, C., Ryan, C.: Best practices in mobile research. Paper presented at ESOMAR Online Research, Chicago, Oct 26–28 (2009)
Zuell, C., Menold, N., Körber, S.: The influence of the answer box size on item nonresponse to open-ended questions in a web survey. Soc. Sci. Comput. Rev. 33(1), 115–122 (2015). doi:10.1177/0894439314528091