One of 5 or so places in the brain that can get a dopamine burst when a bad thing happens (opposite of the usual) is closely tied to inferotemporal cortex (IT). I talked about it in "Example 2C" here - https://www.lesswrong.com/posts/jrewt3rLFiKWrKuyZ/big-picture-of-phasic-dopamine#Example_2C__Visual_attention Basically, as far as I can tell, IT is "making decisions" about what to attend to within the visual scene, and it's being rewarded NOT for "things are going well in life", but rather for "something scary or exciting is happening". So from IT's own narrow perspective, noticing the lion is very rewarding. (Amusingly, "noticing a lion" was the example in my blog post too!)
Turning to look at the lion is a type of "orienting reaction", I think. I'm not entirely sure of the details, but I think orienting reactions involve a network of brain regions one of which is IT. The superior colliculus (SC) is involved here too, and SC is ALSO not part of the "things are going well in life" RL system—in fact, SC is not even in the cortex at all, it's in the brainstem.
So yeah, basically, looking at the lion mostly "isn't reinforceable", or to the extent that it is "reinforceable", it's being reinforced by a different reward signal, one in which "scary" is good, as far as I understand right now.
Deciding to open an email, on the other hand, has basically nothing to do with IT or superior colliculus, but rather involves high-level decision-making (dorsolateral prefrontal cortex maybe?), and that brain region DOES get driven by the main "things are going well in life" reward signal.
Offering a clarification here: I don't believe that the IT cortex receives a dopamine burst when bad things happen. The paper you linked to in the Less Wrong post (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC38733/) correctly identifies that IT is an input to and output of the *basal ganglia*, which is a loop from cortex -> striatum -> nigra -> thalamus -> cortex. But that's not the same as saying that IT receives lots of dopamine input (it does not). So there isn't really a problem here in terms of dopamine training higher-order visual areas/rewarding when bad things happen. (The tail of the striatum, and potentially other aversive hotspots, is another question. Those regions tend to drive avoidance and not approach, so it's not really fair to say noticing the lion is rewarding in this case, despite phasic dopamine in these areas.)
Thanks! In my defense I didn't say that IT has a dopamine burst. What I said was, well, maybe it's a bit confusing out of a particular context. So here's the context.
There are a bunch of parallel cortico-basal ganglia-thalamocortical loops, throughout the cerebrum. I think for certain purposes, we should treat "one single loop" as a unit that is trained (by RL or supervised learning) to do one particular thing. See https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.379.6154&rep=rep1&type=pdf . In that paper, referring to their Fig. 5, they talk about "column-wise thinking" as the traditional way of thinking, where you say what does the striatum do? what does the cortex do? Etc. They are proposing an alternative, namely "row-wise thinking": What does this loop do? What does that loop do? And that's where this particular comment is coming from.
It turns out that IT and the tail of the caudate are looped up into the same cortico-basal ganglia-thalamocortical loops—IT is the cortex stop and tail of caudate is the striatum stop of the same loops. Therefore (in this way of thinking) if the tail of the caudate is getting aversive dopamine bursts, that's basically a mechanism for manipulating what IT will do.
You're still welcome to disagree with that perspective of course, but hopefully at least you understand it a bit now.
Yes, thank you for the clarification! Sorry if I misinterpreted your original comment. I'm a big fan of the "row-wise" thinking you espouse. However, I don't think there's good evidence either way for what effect DA has on the cortical sites to which BG loops return. Indeed, I don't think there's a good account for why these are loops at all! Certainly, the easiest way to think about it is that cortex is providing the "state" input to the RL algorithm, and DA is training the weights of corticostriatal synapses to compute value (or threat, in the case of the caudate tail). But there has been interesting work in RL asking how can we harness more information from RPE signals to train better state representations (see e.g. speculation in Dabney et al., Nature, 2020). Of course, in ML, you just backpropagate this information so it's no big deal. In the brain, backpropagation is tricky or impossible, so maybe the BG gets around this limitation by somehow forward-propagating the info all the way around the BG loop. But there isn't really a plausible account of how that would happen either. By the way, this is much more general than just IT cortex; primary visual and auditory cortex project to tail of striatum, motor and sensory cortex to dorsolateral striatum, etc. See Hintiryan et al., 2016 and Hunnicutt et al., 2016 for much more than you wanted to know.
tl;dr I think we don't yet know nearly enough to say what effect, if any, DA has on sensory cortices that provide input to the striatum.
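To make the "easiest way to think about it" concrete, here is a toy sketch of that picture: cortex supplies a state vector, the corticostriatal weights compute value, and the dopamine signal is treated as a reward prediction error (RPE) that trains those weights. This is plain TD(0) with a linear value function, not a model taken from any of the papers mentioned, and every name and number in it (`td_update`, the one-hot `CUE`/`OUTCOME` states, `alpha`, `gamma`) is an assumption made up for illustration.

```python
# Toy TD(0) sketch. Assumptions (all hypothetical, for illustration only):
#   features = "cortical" state input, weights = "corticostriatal" synapses,
#   rpe      = the dopamine-like reward prediction error.

def value(weights, features):
    """Linear value estimate: dot product of weights and state features."""
    return sum(w * x for w, x in zip(weights, features))

def td_update(weights, features, reward, next_features, alpha=0.1, gamma=0.9):
    """One TD(0) step. Returns (new_weights, rpe)."""
    rpe = reward + gamma * value(weights, next_features) - value(weights, features)
    # Only the weights change; the state features themselves are never adjusted.
    new_weights = [w + alpha * rpe * x for w, x in zip(weights, features)]
    return new_weights, rpe

# Two one-hot states: a cue, then an outcome; the episode ends afterwards.
CUE, OUTCOME, TERMINAL = [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]

def run_episodes(n, reward=1.0):
    """Repeat cue -> outcome n times; record the RPE at the outcome each time."""
    weights = [0.0, 0.0]
    rpes_at_outcome = []
    for _ in range(n):
        weights, _ = td_update(weights, CUE, 0.0, OUTCOME)
        weights, rpe = td_update(weights, OUTCOME, reward, TERMINAL)
        rpes_at_outcome.append(rpe)
    return weights, rpes_at_outcome

weights, rpes = run_episodes(200)
print("learned weights:", weights)
print("first and last RPE at outcome:", rpes[0], rpes[-1])
```

The same rule works if `reward` stands for threat rather than value, which is the caudate-tail case. Note that the update only touches the weights: nothing in it flows back to change the state representation, which is exactly the part we don't have a good account of in the brain.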
Re. "IT is 'making decisions' about what to attend to within the visual scene, and it's being rewarded NOT for 'things are going well in life', but rather for 'something scary or exciting is happening'": Do you know of any brain systems which relate this to aesthetics (cast as preferences about what we attend to), curiosity, or the fun of problem-solving? I would be interested in anything along those lines.
I pretty much agree with everything you said.