Submitted: 10 January 2017; Accepted: 16 March 2017; Published Online: 10 April 2017
The Journal of the Acoustical Society of America 141, 2474 (2017); https://doi.org/10.1121/1.4979470
  • Leonard Varghese
  • Samuel R. Mathias
  • Seth Bensussen
  • Kenny Chou
  • Hannah R. Goldberg
  • Yile Sun
  • Robert Sekuler
  • Barbara G. Shinn-Cunningham
Cross-modal interactions of auditory and visual temporal modulation were examined in a game-like experimental framework. Participants observed an audiovisual stimulus (an animated, sound-emitting fish) whose sound intensity and/or visual size oscillated sinusoidally at either 6 or 7 Hz. Participants made speeded judgments about the modulation rate in either the auditory or visual modality while doing their best to ignore information from the other modality. The task-irrelevant modality was modulated at the same rate as the task-relevant modality (congruent conditions), was modulated at the other rate (incongruent conditions), or was not modulated at all (unmodulated conditions). Both performance accuracy and parameter estimates from drift-diffusion decision modeling indicated that (1) the presence of temporal modulation in both modalities, regardless of whether modulations were matched or mismatched in rate, resulted in audiovisual interactions; (2) congruence in audiovisual temporal modulation resulted in more reliable information processing; and (3) the effects of congruence appeared to be stronger when judging visual modulation rates (i.e., audition influencing vision) than when judging auditory modulation rates (i.e., vision influencing audition). The results demonstrate that audiovisual interactions from temporal modulations are bi-directional in nature, but with potential asymmetries in the size of the effect in each direction.
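For readers unfamiliar with drift-diffusion modeling, the following minimal Python sketch simulates a generic two-choice diffusion process. It is only an illustration of how a higher drift rate (more reliable evidence) yields more accurate and faster responses; it is not the fitting procedure used in the study. The function names, the drift rates (0.5 and 1.5), the threshold, the noise level, and the non-decision time are all illustrative assumptions rather than values from the paper.

```python
import random


def simulate_ddm_trial(drift, threshold=1.0, noise=1.0, dt=0.002,
                       non_decision=0.3, rng=random):
    """Simulate one two-choice trial with symmetric bounds at +/- threshold.

    Returns (correct, reaction_time_in_seconds). All parameter values are
    hypothetical and chosen only for illustration.
    """
    evidence = 0.0                 # unbiased start, midway between the bounds
    elapsed = 0.0
    step_sd = noise * dt ** 0.5    # standard deviation of the noise per step
    while abs(evidence) < threshold:
        # Euler step: deterministic drift plus Gaussian diffusion noise
        evidence += drift * dt + rng.gauss(0.0, step_sd)
        elapsed += dt
    # The upper bound is treated as the correct response
    return evidence > 0, elapsed + non_decision


def summarize(drift, n_trials=1000, seed=0):
    """Return (accuracy, mean reaction time) over simulated trials."""
    rng = random.Random(seed)      # seeded for reproducibility
    trials = [simulate_ddm_trial(drift, rng=rng) for _ in range(n_trials)]
    accuracy = sum(correct for correct, _ in trials) / n_trials
    mean_rt = sum(rt for _, rt in trials) / n_trials
    return accuracy, mean_rt


if __name__ == "__main__":
    # Hypothetical drift rates standing in for less and more reliable evidence
    for label, drift in [("lower drift", 0.5), ("higher drift", 1.5)]:
        accuracy, mean_rt = summarize(drift)
        print(f"{label}: accuracy={accuracy:.3f}, mean RT={mean_rt:.3f} s")
```

In this sketch, raising the drift rate while holding the threshold fixed increases accuracy and shortens mean reaction time, which is the sense in which a drift-rate estimate can index the reliability of information processing.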
This work was funded by CELEST, a National Science Foundation Science of Learning Center (SBE-0354378), and SL-CN: Engaging Learning Network, a National Science Foundation Collaborative Network (SMA/SBE-1540920). We would like to thank Lorraine Delhorne for conducting hearing screenings on the individuals who took part in this study. We would also like to thank Diego Fernandez-Duque and three anonymous reviewers for their comments on an earlier version of this manuscript.
© 2017 Acoustical Society of America.