Turing 2018/7: Blockhead, the Chinese Room, and ELIZA

1
00:00:02,160 --> 00:00:09,000
Last time we went through Alan Turing's 1950 paper on the Turing test.

2
00:00:09,000 --> 00:00:20,310
And today we're going to be looking a bit more at that, but mainly at some related issues to do with some other thought experiments and a

3
00:00:20,310 --> 00:00:26,910
kind of software that cast serious doubt on the Turing test from the last lecture.

4
00:00:26,910 --> 00:00:37,200
Next time, I'm going to try to bring the threads together, looking at how Turing and Searle match up. In this lecture, we're going to see

5
00:00:37,200 --> 00:00:44,930
John Searle getting the upper hand, but I promise you that will be redressed.

6
00:00:44,930 --> 00:00:55,130
So first of all, a thought experiment from Ned Block, very aptly called Blockhead, though I don't think he gave it that name.

7
00:00:55,130 --> 00:01:03,260
So the idea is that the Turing test cannot be a good test of intelligence, because you can

8
00:01:03,260 --> 00:01:11,330
imagine, in a thought experiment, a system that would pass the Turing test and yet very clearly be mindless,

9
00:01:11,330 --> 00:01:23,870
unintelligent. So let's suppose we imagine every possible sensible conversation of a given length being stored.

10
00:01:23,870 --> 00:01:31,370
So humongous range of all the possible sensible conversations that you could have in English.

11
00:01:31,370 --> 00:01:36,830
And imagine then that we have a system which goes through these.

12
00:01:36,830 --> 00:01:43,370
And when it gets an input, presumably a sensible input from its interlocutor,

13
00:01:43,370 --> 00:01:51,890
it looks through all the sensible conversations until it finds one that matches and then it gives the appropriate response.

14
00:01:51,890 --> 00:02:00,980
Now, clearly, if that were to be done, then what comes from the machine will be a sensible conversation.

15
00:02:00,980 --> 00:02:10,700
So in that case, it would pass the Turing test, but nevertheless, it clearly isn't an intelligent system because it's just a gigantic look up table.

16
00:02:10,700 --> 00:02:18,020
There's no understanding there. We can't call it intelligent. All right.
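To make the idea concrete, here is a minimal sketch in Python of what such a lookup machine would amount to. This is my own illustration, not Block's formulation: the stored conversations and the matching scheme are invented for the example.

```python
# A Blockhead-style conversation machine: every sensible conversation up to a
# fixed length is stored in advance, and the machine simply finds one whose
# opening matches the exchange so far and reads off the stored next line.
CONVERSATIONS = [
    ["Hello, how are you?", "Fine, thanks. And you?", "Not bad at all."],
    ["Hello, how are you?", "Fine, thanks. And you?", "A bit tired, honestly."],
    # ... in reality this table would need vastly more entries than there are
    # atoms in the universe, which is the point made below.
]

def blockhead_reply(history):
    """Return the stored continuation of any conversation that matches the
    exchange so far, or None if nothing in the table matches."""
    for conversation in CONVERSATIONS:
        if conversation[:len(history)] == history and len(conversation) > len(history):
            return conversation[len(history)]
    return None
```

Ask it "Hello, how are you?" and it answers sensibly; ask it anything outside the table and it has nothing to say, because nothing in it generates language, it only retrieves it.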

17
00:02:18,020 --> 00:02:23,750
Let's just, first of all, do a little reality check on this thought experiment.

18
00:02:23,750 --> 00:02:29,510
One thing I'd advise you quite generally when you come across a thought experiment in philosophy, and there are lots of them:

19
00:02:29,510 --> 00:02:35,570
It's always worth asking, is this the least bit plausible?

20
00:02:35,570 --> 00:02:42,290
Because if it isn't, that might cast doubt on it. Okay, so, just so that we've got some numbers to play with:

21
00:02:42,290 --> 00:02:52,610
imagine that we're thinking of a conversation which has 10 sentences from the interrogator and 10 from the respondent.

22
00:02:52,610 --> 00:02:58,820
And imagine that on average, each of these sentences is 10 words.

23
00:02:58,820 --> 00:03:04,040
Let's suppose that on average, each word can be chosen from a menu of about 100 choices.

24
00:03:04,040 --> 00:03:09,140
Right? I mean, there are thousands, tens of thousands of words in English, so we're being pretty modest here.

25
00:03:09,140 --> 00:03:14,030
But let's just get a ballpark figure. How many conversations does that generate?

26
00:03:14,030 --> 00:03:23,120
Well, we've got 20 sentences, each involving 10 words and each of those words chosen from a menu of 100 choices.

27
00:03:23,120 --> 00:03:33,950
So the number of conversations here is about 100 to the 200, which is 10 to the 400.

28
00:03:33,950 --> 00:03:41,990
Now, the number of atoms in the universe is probably around about 10 to the 80.

29
00:03:41,990 --> 00:03:47,900
So imagine replacing every atom in the known universe with a complete universe.

30
00:03:47,900 --> 00:03:52,640
How many atoms do we now have? Well, 10 to the 160.

31
00:03:52,640 --> 00:04:01,760
Imagine doing the same thing three more times, taking every atom and replacing it with a complete universe.

32
00:04:01,760 --> 00:04:06,260
When you've done that a total of four times, you have 10 to the 400 atoms.

33
00:04:06,260 --> 00:04:11,000
That's how many conversations we're talking about. OK.
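Just to check the arithmetic, here is the calculation written out in Python, using the ballpark figures from the lecture (20 sentences of 10 words, each word picked from a menu of 100).

```python
# Ballpark figures from the lecture.
choices_per_word = 100
words_per_sentence = 10
sentences = 20          # 10 from the interrogator, 10 from the respondent

conversations = choices_per_word ** (words_per_sentence * sentences)
print(conversations == 10 ** 400)          # True: 100^200 = 10^400

atoms_in_universe = 10 ** 80
# Replacing every atom with a whole universe multiplies the count by 10^80;
# doing that four times over takes 10^80 up to 10^400.
print(atoms_in_universe * (10 ** 80) ** 4 == 10 ** 400)   # True
```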

34
00:04:11,000 --> 00:04:16,940
So it's not that this thought experiment is just a little bit implausible.

35
00:04:16,940 --> 00:04:31,430
It's absolutely outrageously implausible. And we're only talking here about a conversation that involves 10 questions and replies, OK.

36
00:04:31,430 --> 00:04:38,450
Now, suppose you imagine a system that exhibits some impressive behaviour.

37
00:04:38,450 --> 00:04:47,720
Is that enough to attribute intelligence to it? Well, we could mean two things here; there are two different questions we might be asking.

38
00:04:47,720 --> 00:04:52,700
On the one hand, we might be asking whether that behaviour is definitive of intelligence.

39
00:04:52,700 --> 00:05:00,830
So anything that behaves like that is correctly described as intelligent just in virtue of that behaviour.

40
00:05:00,830 --> 00:05:11,150
That's one thing we could mean, but the other thing we could mean is that the behaviour provides strong evidence of intelligence

41
00:05:11,150 --> 00:05:16,130
because the behaviour could only plausibly be generated by something that is intelligent.

42
00:05:16,130 --> 00:05:22,340
And those are two different things, right? One says this behaviour is, by definition, intelligent.

43
00:05:22,340 --> 00:05:26,780
This one says it's strong evidence of intelligence.

44
00:05:26,780 --> 00:05:35,030
Now, when we come across thought experiments like Blockhead, think about our intuitive reaction to it.

45
00:05:35,030 --> 00:05:39,350
You know, we say, "Oh, that clearly isn't intelligent. It's just a giant look-up table."

46
00:05:39,350 --> 00:05:48,290
All right. That reaction suggests that we're not thinking of intelligence as merely to be measured by the appropriate behaviour.

47
00:05:48,290 --> 00:05:58,020
But actually what we want is behaviour that is generated cleverly with limited resources, not just a look-up table.

48
00:05:58,020 --> 00:06:05,450
But notice that, in practice, the two come to the same thing. Thought experiments can float free of any plausible reality,

49
00:06:05,450 --> 00:06:10,880
but in practice, Blockhead just could not exist.

50
00:06:10,880 --> 00:06:14,600
It's not a plausible thought experiment. So in practice,

51
00:06:14,600 --> 00:06:21,680
in order to generate the kind of behaviour that we think of as strong evidence of intelligence,

52
00:06:21,680 --> 00:06:28,610
that could only come about through some pretty nifty processing underneath, some clever processing,

53
00:06:28,610 --> 00:06:32,540
at least it seems that way.

54
00:06:32,540 --> 00:06:39,890
So I want to suggest that the Blockhead thought experiment actually shouldn't be given too much weight, and to press home that point,

55
00:06:39,890 --> 00:06:45,440
I want to give you a different thought experiment.

56
00:06:45,440 --> 00:06:56,150
Suppose I provide you with a chess playing programme and it plays chess at grandmaster level in real time.

57
00:06:56,150 --> 00:07:01,760
Would you say that shows that it's at least intelligent at playing chess?

58
00:07:01,760 --> 00:07:07,160
Well, you might think so, but here's a thought experiment on the other side.

59
00:07:07,160 --> 00:07:14,630
Suppose someone were to write a computer programme of only about 50 lines of code in a standard general-purpose programming language.

60
00:07:14,630 --> 00:07:18,590
All right, no packages that you can import or anything like that.

61
00:07:18,590 --> 00:07:23,090
So 50 lines of code which could play chess at grandmaster level in real time.

62
00:07:23,090 --> 00:07:26,990
Such a crude programme could not possibly count as genuinely intelligent.

63
00:07:26,990 --> 00:07:37,280
I mean, I ask you, 50 lines of code! All right. Hence, grandmaster chess performance is not reliable proof even of intelligent chess play.

64
00:07:37,280 --> 00:07:44,360
What I want to suggest is that's a rubbish argument. It's a rubbish argument because the hypothesis that it's based on,

65
00:07:44,360 --> 00:07:53,360
that you could even have a computer programme that plays grandmaster chess in real time and is only 50 lines long, is plainly crazy.

66
00:07:53,360 --> 00:07:58,200
So we should not be persuaded by this kind of argument. I mean, you can see how you could make that argument:

67
00:07:58,200 --> 00:08:08,270
I've chosen chess, but you could make it about any domain. So what I'm suggesting to you is we should be very sceptical about thought experiments

68
00:08:08,270 --> 00:08:14,180
that are just completely beyond any bounds of plausibility; they come too cheaply.

69
00:08:14,180 --> 00:08:20,660
You can invent one like this far too easily, and it appeals to our intuitions in certain ways.

70
00:08:20,660 --> 00:08:31,940
But I'm suggesting that our intuitions shouldn't be allowed to be pulled too much by thought experiments that are so far from reality.

71
00:08:31,940 --> 00:08:37,250
Very nice quote here from Dan Dennett talking about thought experiments.

72
00:08:37,250 --> 00:08:38,810
If you look at the history of philosophy,

73
00:08:38,810 --> 00:08:45,440
you see that all the great and influential stuff has been technically full of holes, but utterly memorable and vivid.

74
00:08:45,440 --> 00:08:51,290
I think he's slightly exaggerating there: not all the great and influential stuff, but some of it.

75
00:08:51,290 --> 00:08:59,840
They are what I call intuition pumps, lovely thought experiments like Plato's Cave and Descartes' Evil Demon and Hobbes's vision of the

76
00:08:59,840 --> 00:09:05,510
state of nature and the social contract, and even Kant's idea of the categorical imperative.

77
00:09:05,510 --> 00:09:11,720
I don't know of any philosopher who thinks that any one of those is a logically sound argument for anything.

78
00:09:11,720 --> 00:09:18,860
I suspect there are some who do, but they're wonderful imagination grabbers, jungle gyms for the imagination.

79
00:09:18,860 --> 00:09:25,470
They structure the way you think about a problem. These are the real legacy of the history of philosophy.

80
00:09:25,470 --> 00:09:34,430
Okay. So I think there's a lot in this. I mean, the idea of a thought experiment as pumping our intuitions is a very strong one.

81
00:09:34,430 --> 00:09:42,320
But let's look a bit more closely at what that's doing. The thought experiments are trying to illuminate one thing, for example,

82
00:09:42,320 --> 00:09:49,070
the nature of the world or the social order by harnessing our familiar understanding of something else.

83
00:09:49,070 --> 00:09:59,510
So Plato's analogy of the cave, you've got shadows cast by a fire, and that's supposed to tell us something about the way the world is.

84
00:09:59,510 --> 00:10:10,790
Social contract theories suggest that our interactions in the social order are somewhat similar to making an explicit contract.

85
00:10:10,790 --> 00:10:15,020
So we're encouraged to see one thing as relevantly similar to the other.

86
00:10:15,020 --> 00:10:21,320
And that can pull our judgements in particular directions regarding that thing.

87
00:10:21,320 --> 00:10:26,570
The same would be true of computer analogies, so obviously comparing the mind to a computer programme.

88
00:10:26,570 --> 00:10:31,700
Richard Dawkins gives another example: he compares a religion to a computer virus.

89
00:10:31,700 --> 00:10:35,720
He suggests that religious beliefs have certain characteristics which make

90
00:10:35,720 --> 00:10:44,030
them spread in a very fertile way in the mind without a critical examination.

91
00:10:44,030 --> 00:10:49,340
But the problem is that different analogies can easily suggest quite different conclusions.

92
00:10:49,340 --> 00:10:55,970
So the problem with thought experiments is that they don't necessarily point

93
00:10:55,970 --> 00:11:02,690
unambiguously in a single direction, or in a direction that is necessarily merited.

94
00:11:02,690 --> 00:11:10,310
OK, we'll be coming back to this later. But now I want to introduce a thought experiment which is nearly as famous as the Turing test:

95
00:11:10,310 --> 00:11:18,290
John Searle's Chinese Room. OK, so we imagine a conversation conducted in written Chinese.

96
00:11:18,290 --> 00:11:30,170
It's very like the Turing test, except that the participant in the Turing test, the one who's giving the responses, is somebody enclosed in a room.

97
00:11:30,170 --> 00:11:36,760
All right, someone who, like me, understands English but knows no Chinese at all.

98
00:11:36,760 --> 00:11:40,690
But the conversation is in fact conducted in Chinese,

99
00:11:40,690 --> 00:11:49,720
so we have a Chinese speaker who writes questions on one side of a card and posts them into the room and then

100
00:11:49,720 --> 00:11:59,170
the person in the room that's me has to consult various rulebooks to decide how to process what's written down.

101
00:11:59,170 --> 00:12:06,370
And then I write down on the other side of the card some symbols which turn out to be Chinese symbols,

102
00:12:06,370 --> 00:12:11,110
giving meaningful answers to the question that's been posted in.

103
00:12:11,110 --> 00:12:17,200
Okay, so the guy in the room, that's me, has no knowledge whatever of the Chinese language or the meaning,

104
00:12:17,200 --> 00:12:22,900
the significance, of the symbols he's reading or writing. That's what the word "semantics" refers to here.

105
00:12:22,900 --> 00:12:30,700
But as far as I'm concerned, when I see these symbols, they have no semantics, no meaning for me.

106
00:12:30,700 --> 00:12:37,120
Instead, I'm generating my written answers by strictly applying rules based purely on the syntax, that is, the shape and

107
00:12:37,120 --> 00:12:45,580
the structure of the character strings that come in. And Searle gives a very helpful example here of what he means:

108
00:12:45,580 --> 00:12:53,740
Take a squiggle-squiggle sign out of basket number one and put it next to a squoggle-squoggle sign from basket number two.
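Here is a minimal sketch, in Python, of what such a purely formal rule might look like. The symbols and baskets are placeholders of my own, not Searle's actual example; the point is only that the rule mentions shapes, never meanings.

```python
# Baskets of Chinese characters, identified by the operator only by their shape.
basket_one = ["山", "木", "口"]
basket_two = ["水", "火", "土"]

def apply_rule(shape_from_one, shape_from_two):
    """Take a sign out of basket one and put it next to a sign from basket two:
    pure symbol shuffling, with no meaning attached to either symbol."""
    basket_one.remove(shape_from_one)          # matched by shape alone
    assert shape_from_two in basket_two        # likewise matched by shape alone
    return shape_from_one + shape_from_two     # juxtapose the two signs
```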

109
00:12:53,740 --> 00:13:00,460
I've been unable to discern from Chinese friends exactly which symbol is the squiggle-squiggle sign or the squoggle-squoggle,

110
00:13:00,460 --> 00:13:07,300
but anyway, here's a cartoon from

111
00:13:07,300 --> 00:13:12,340
Wikimedia Commons showing a very simplified version of the Chinese Room.

112
00:13:12,340 --> 00:13:15,280
Notice that this is completely ridiculous.

113
00:13:15,280 --> 00:13:23,290
We've got individual symbols coming in and going out, and the guy is consulting a single rulebook which looks extremely crude.

114
00:13:23,290 --> 00:13:30,010
There's no way that's going to produce meaningful answers in a Turing test like situation, right?

115
00:13:30,010 --> 00:13:34,360
When the questions coming in could be many and varied.

116
00:13:34,360 --> 00:13:43,450
Vladimir, who's a computer science and philosophy student at Harvard, has kindly produced an illustration which is a little bit more realistic.

117
00:13:43,450 --> 00:13:49,780
So here's the guy in the Chinese Room: in come the questions, out go his answers.

118
00:13:49,780 --> 00:13:55,180
He's got a huge library of books. He's identifying the symbols.

119
00:13:55,180 --> 00:14:04,390
This book is telling him to go to this other book, page such and such, and you can see that he's got a box of counters there,

120
00:14:04,390 --> 00:14:10,930
and he's constructing the symbols over here, and there are directions to various other rooms.

121
00:14:10,930 --> 00:14:19,060
And here there is a map of the whole layout just making clear that we've got an absolutely huge building.

122
00:14:19,060 --> 00:14:22,960
And maybe, maybe that's a little bit more plausible.

123
00:14:22,960 --> 00:14:28,870
Well, it's a lot more plausible than that. Whether it's actually plausible?

124
00:14:28,870 --> 00:14:30,730
Well, maybe, maybe not.

125
00:14:30,730 --> 00:14:40,930
But it's a far better picture of the kind of sophistication that would be required for the Chinese Room to get off the ground.

126
00:14:40,930 --> 00:14:52,570
Okay, so just a point: the original version of the Chinese Room is different from the well-known one. Searle actually started out in 1980

127
00:14:52,570 --> 00:15:00,910
giving a context in which the questions were limited to understanding of a story told in Chinese.

128
00:15:00,910 --> 00:15:08,080
So a story is produced in Chinese, and then what the guy has to do is answer questions about it.

129
00:15:08,080 --> 00:15:16,150
So far more limited than a Turing test. But in 1984 Searle, emboldened, went beyond that,

130
00:15:16,150 --> 00:15:22,060
And basically we have a Turing test conducted through the Chinese room.

131
00:15:22,060 --> 00:15:27,410
OK. What is Searle's conclusion from all this?

132
00:15:27,410 --> 00:15:32,530
Well, clearly the man in the room that's me does not understand Chinese, OK?

133
00:15:32,530 --> 00:15:42,080
Nobody's going to argue that I, in the room, manipulating all these symbols without a clue what any of them mean, understand Chinese.

134
00:15:42,080 --> 00:15:47,830
I clearly don't. But I am producing meaningful replies.

135
00:15:47,830 --> 00:15:56,860
Moral: understanding a language, or indeed having mental states at all, involves more than just having a bunch of formal symbols.

136
00:15:56,860 --> 00:16:02,110
It involves having an interpretation or a meaning attached to those symbols.

137
00:16:02,110 --> 00:16:07,450
Computer programmes, like the rules followed by the man in the room, are purely formally specified.

138
00:16:07,450 --> 00:16:16,580
That is, they have no semantic content. But in fact, Searle's conclusion,

139
00:16:16,580 --> 00:16:22,250
or what exactly it is that he's denying, is actually not so easy to pin down, and

140
00:16:22,250 --> 00:16:25,790
Searle is rather slippery on this. We'll come back to this in the next lecture, and we'll see

141
00:16:25,790 --> 00:16:32,990
there's a reason why he's somewhat slippery. But here I want to draw your attention to it.

142
00:16:32,990 --> 00:16:40,730
So most of the time, he expresses his thesis as a denial of intentionality or semantic content.

143
00:16:40,730 --> 00:16:48,470
Intentionality, by the way, is the way in which words and thoughts reach out to things in the world.

144
00:16:48,470 --> 00:16:54,110
So suppose we take the word tree in Chinese.

145
00:16:54,110 --> 00:16:58,850
If you're a Chinese speaker, for you that word has intentionality.

146
00:16:58,850 --> 00:17:03,110
You see the word and you think of trees out there in the world.

147
00:17:03,110 --> 00:17:07,580
But for me, if I see that symbol, it has no such intentionality.

148
00:17:07,580 --> 00:17:14,590
I see the symbol, I see its shape, but I've no idea what it refers to.

149
00:17:14,590 --> 00:17:20,200
OK, so Searle denies intentionality, semantic content, to the Chinese Room,

150
00:17:20,200 --> 00:17:28,030
but he also denies that digital machines can have a mind, mental states, mental content, cognitive states, cognitive processes.

151
00:17:28,030 --> 00:17:37,270
And he describes his argument as attacking the claim of strong artificial intelligence that digital machines can think or have consciousness.

152
00:17:37,270 --> 00:17:45,160
Now that is a very wide range of claims that seem to me to be potentially very different.

153
00:17:45,160 --> 00:17:51,670
I've given some citations there so you can follow them up.

154
00:17:51,670 --> 00:18:03,970
His most cautious interpretation is to say that what cannot happen is that digital computers have semantic state,

155
00:18:03,970 --> 00:18:10,570
meaningful state purely in virtue of following a symbolic algorithm.

156
00:18:10,570 --> 00:18:14,650
So that seems a relatively modest claim.

157
00:18:14,650 --> 00:18:22,270
OK, so the claim is: here I am in the Chinese Room, processing these symbols.

158
00:18:22,270 --> 00:18:30,940
I am just following a symbolic algorithm there, and purely in virtue of that,

159
00:18:30,940 --> 00:18:40,960
the states that I am manipulating, whether in my mind or in the room, have no semantic significance.

160
00:18:40,960 --> 00:18:49,660
If they have any semantic significance, they are going to have to acquire it from something more than the formal algorithm.

161
00:18:49,660 --> 00:18:56,680
But he often goes beyond that, as we'll see. One point, by the way:

162
00:18:56,680 --> 00:19:05,920
just like Alan Turing in his paper, do you remember when he said we exclude "men born in the usual way"? In exactly the same way,

163
00:19:05,920 --> 00:19:12,340
Searle does not want to say that machines can't think, because we are machines and we can think,

164
00:19:12,340 --> 00:19:20,290
at least in his book; it's digital computers, again, that can't think.

165
00:19:20,290 --> 00:19:31,330
And here he's clarifying: in virtue of following an algorithm. OK, I'm now going to look at two main replies to Searle's thought experiment.

166
00:19:31,330 --> 00:19:34,810
As I say, we'll be coming back to this next time and digging in a bit deeper.

167
00:19:34,810 --> 00:19:43,360
But for now, I think these are the two most popular replies, and they're the ones that I think are particularly important.

168
00:19:43,360 --> 00:19:48,820
So here's the system reply. OK. The man in the room, you know, I'm processing all these things.

169
00:19:48,820 --> 00:19:52,390
I don't understand Chinese. That's uncontroversial.

170
00:19:52,390 --> 00:19:59,830
But now think about the room itself: the room containing me, containing the books, containing the symbols, containing,

171
00:19:59,830 --> 00:20:11,050
you know, all the rules. That whole system enclosed in the room is actually handling Chinese in an intelligent way.

172
00:20:11,050 --> 00:20:16,090
Questions are coming in. Intelligent responses are going out.

173
00:20:16,090 --> 00:20:25,750
Maybe there's understanding there: even if I in the room, as, if you like, the central processing unit of the room, don't have understanding of Chinese,

174
00:20:25,750 --> 00:20:38,630
nevertheless, the whole system does. Searle rebuts that: he says this is subject to exactly the same objection.

175
00:20:38,630 --> 00:20:42,500
There is no way that the system can get from the syntax to the semantics.

176
00:20:42,500 --> 00:20:51,890
I, as the central processing unit, have no way of figuring out what any of these symbols means, but then neither does the whole system.

177
00:20:51,890 --> 00:21:01,670
Okay. For the moment, I want to just point out, and this is a reference to Copeland's very useful book,

178
00:21:01,670 --> 00:21:11,090
which I've mentioned before in these lectures. He points out that Searle's rebuttal here just begs the question, right?

179
00:21:11,090 --> 00:21:21,380
As a matter of logic, the fact that the man in the room doesn't understand Chinese does not prove that the room doesn't understand Chinese.

180
00:21:21,380 --> 00:21:27,140
He gives the following argument: Bill the cleaner has never sold pyjamas to Korea;

181
00:21:27,140 --> 00:21:32,180
therefore, Bill's company has never sold pyjamas to Korea. OK, that's obviously a bad argument.

182
00:21:32,180 --> 00:21:39,350
Okay. The point is that a component of a system can lack a property that the whole system has.

183
00:21:39,350 --> 00:21:45,440
So the fact that the man doesn't understand Chinese as a matter of logic does not

184
00:21:45,440 --> 00:21:51,340
imply that the whole system of which he's a part doesn't understand Chinese.

185
00:21:51,340 --> 00:21:58,100
So if Searle wants to rebut the system reply, he has to give a more positive argument.

186
00:21:58,100 --> 00:22:02,930
Of course, the Chinese Room still remains forceful as an intuition pump, because you might be thinking,

187
00:22:02,930 --> 00:22:08,690
you know: okay, Copeland, you've pointed out this logical point,

188
00:22:08,690 --> 00:22:12,410
that the argument isn't strictly logically valid. But come on,

189
00:22:12,410 --> 00:22:17,540
we all know that a room can't understand anything, you know, in that thought experiment.

190
00:22:17,540 --> 00:22:25,970
The only plausible understander is the man. So if he doesn't understand Chinese, then nothing there understands Chinese.

191
00:22:25,970 --> 00:22:29,150
Yeah, OK.

192
00:22:29,150 --> 00:22:37,160
But if that's the way you're going to argue, you are putting a lot of weight on the thought experiment, which, as we said, is a very implausible one.

193
00:22:37,160 --> 00:22:44,480
OK. It seems very strange to invent a thought experiment which is so distant from reality

194
00:22:44,480 --> 00:22:51,050
and then appeal ultimately just to rather basic intuitions about it in that way.

195
00:22:51,050 --> 00:23:02,390
We want to actually have a more forceful argument. OK, the second reply, a very popular one, is the robot reply.

196
00:23:02,390 --> 00:23:09,950
So bear in mind that Searle is putting a lot of emphasis on what he calls semantic content or intentionality.

197
00:23:09,950 --> 00:23:14,090
And I said intentionality is to do with symbols being able to, as it were,

198
00:23:14,090 --> 00:23:21,440
reach out to the world, to refer to things. If we have intentional states like beliefs and desires,

199
00:23:21,440 --> 00:23:26,150
these are beliefs about things, desires for things.

200
00:23:26,150 --> 00:23:33,000
They're intentional in that they refer beyond us to things outside.

201
00:23:33,000 --> 00:23:41,780
OK, now suppose then we take a system like the Chinese room and imagine that it's actually embedded in the world.

202
00:23:41,780 --> 00:23:50,420
So instead of just having input and output through the slots, you know, with questions coming in and answers going back,

203
00:23:50,420 --> 00:23:59,420
suppose the room were actually connected up to robotic sensors and effectors so that

204
00:23:59,420 --> 00:24:06,110
the output from the room, instead of just being some written symbols in Chinese,

205
00:24:06,110 --> 00:24:13,190
is actually movements, actions and so forth.

206
00:24:13,190 --> 00:24:22,160
Well, Searle gives essentially the same reply: as long as we suppose that the robot has only a computer for a brain, or,

207
00:24:22,160 --> 00:24:27,620
you know, with the man in the Chinese Room being here the analogue for the computer,

208
00:24:27,620 --> 00:24:31,040
then, even though it might behave exactly as if it understood Chinese,

209
00:24:31,040 --> 00:24:35,990
it would still have no way of getting from the syntax to the semantics of Chinese.

210
00:24:35,990 --> 00:24:41,930
You can see this if you imagine that I am the computer inside a room in the robot's skull.

211
00:24:41,930 --> 00:24:46,970
I shuffle symbols without knowing that some of them come into me from television cameras

212
00:24:46,970 --> 00:24:52,460
attached to the robot's head and others go out to move the robot's arms and legs.

213
00:24:52,460 --> 00:25:00,380
As long as all I have is a formal computer programme, I have no way of attaching any meaning to any of the symbols.

214
00:25:00,380 --> 00:25:05,930
OK. And again, that probably seems quite plausible. There I am in the Chinese room.

215
00:25:05,930 --> 00:25:09,410
I'm still in the same position. I mean, I'm getting these symbols in.

216
00:25:09,410 --> 00:25:16,700
I'm putting symbols out. Those symbols are acting as inputs to some motors,

217
00:25:16,700 --> 00:25:25,880
which are moving the robot around in the world and doing actual things, simulating a comprehension of Chinese.

218
00:25:25,880 --> 00:25:30,010
You know, imagine the robot is actually interacting with Chinese people

219
00:25:30,010 --> 00:25:41,890
out there in the world, and imagine that the symbols that come into me are being taken from microphones and so forth, involving Chinese people.

220
00:25:41,890 --> 00:25:46,750
I, there in the robot's skull, have no knowledge of that at all.

221
00:25:46,750 --> 00:25:56,720
So again, it seems that the symbols have no semantic content, at least again, that's the claim.

222
00:25:56,720 --> 00:25:59,960
So we're going to return to this in the last lecture. For now,

223
00:25:59,960 --> 00:26:06,590
I just want you to notice the intimate connection between Searle's Chinese Room and the Turing test.

224
00:26:06,590 --> 00:26:10,910
Both of them postulate an algorithmic system capable of generating conversation

225
00:26:10,910 --> 00:26:15,050
that's indistinguishable in quality from that of an intelligent native speaker.

226
00:26:15,050 --> 00:26:21,980
But they draw opposite conclusions. So Turing, in effect, is saying something like this.

227
00:26:21,980 --> 00:26:27,350
Think of the sonnet example in his paper.

228
00:26:27,350 --> 00:26:31,610
Imagine a computer programme that's able to converse like this.

229
00:26:31,610 --> 00:26:37,940
How could you possibly deny that it's genuinely intelligent? It seems quite plausible, right?

230
00:26:37,940 --> 00:26:44,300
Whereas Searle is saying: imagine a computer programme that conducts its conversation using crudely

231
00:26:44,300 --> 00:26:49,670
syntactic processes like this, the man in the room with the baskets and the rules.

232
00:26:49,670 --> 00:26:53,660
How could you possibly claim that it's genuinely intelligent?

233
00:26:53,660 --> 00:27:02,840
So here we've got, you know, two thought experiments, one of them pulling us in one direction,

234
00:27:02,840 --> 00:27:08,990
one of them pulling us in the other: two contrary intuition pumps.

235
00:27:08,990 --> 00:27:14,540
Well, just like with Blockhead, let's go back to reality.

236
00:27:14,540 --> 00:27:19,520
As I say, we'll be coming back to this debate between them in the last lecture.

237
00:27:19,520 --> 00:27:25,850
But what I want to do right now is think about the plausibility of the thought experiments.

238
00:27:25,850 --> 00:27:32,720
Well, first of all, the Chinese room is completely and utterly implausible.

239
00:27:32,720 --> 00:27:41,930
Sophisticated linguistic behaviour being generated in real time by manually consulting books of rules contained in a room.

240
00:27:41,930 --> 00:27:51,110
No scope for sensory input, real-time updating, emotional reactions, let alone the complexity of the whole task.

241
00:27:51,110 --> 00:27:56,450
Interestingly, Turing's predictions in his paper are far more reasonable.

242
00:27:56,450 --> 00:28:05,900
So these are from section six of his paper. I drew your attention to them last time, but let's look in a bit more detail at them now.

243
00:28:05,900 --> 00:28:07,730
I believe that in about 50 years time,

244
00:28:07,730 --> 00:28:15,230
it will be possible to programme computers with a storage capacity of about a billion to make them play The Imitation Game so well that

245
00:28:15,230 --> 00:28:24,380
an average interrogator will not have more than 70 percent chance of making the right identification after five minutes of questioning.

246
00:28:24,380 --> 00:28:30,710
Second prediction: the original question, "Can machines think?", I believe to be too meaningless to deserve discussion.

247
00:28:30,710 --> 00:28:37,970
Nevertheless, I believe that at the end of the century, the use of words and general educated opinion will have altered so much that one

248
00:28:37,970 --> 00:28:43,470
will be able to speak of machines thinking without expecting to be contradicted.

249
00:28:43,470 --> 00:28:54,650
OK, so I think the second prediction is actually quite reasonable.

250
00:28:54,650 --> 00:28:59,600
Certainly now. Imagine I'm playing chess against a computer:

251
00:28:59,600 --> 00:29:07,900
I'm looking at the screen and the screen is showing me the analysis that's currently going on within the computer.

252
00:29:07,900 --> 00:29:13,700
It shows me the lines that are being considered and it shows me the balance of material.

253
00:29:13,700 --> 00:29:22,460
And it shows me the overall verdict. So you come up and you say, Peter, why is the computer taking so long to respond?

254
00:29:22,460 --> 00:29:26,780
And I say it's thinking hard, because it's realised that if it tries to defend

255
00:29:26,780 --> 00:29:33,590
against my attack by bringing its knight over to protect the king, knight to f6,

256
00:29:33,590 --> 00:29:39,770
I'll be able to grab its pawn on the other side with my queen.

257
00:29:39,770 --> 00:29:43,220
It's displaying now that it assesses the position as better for me.

258
00:29:43,220 --> 00:29:49,730
Materially, there's that minus one that it's got, you know, I'm going to be a pawn up,

259
00:29:49,730 --> 00:29:55,670
but it's predicting that it won't be too badly off it if it decides to let the pawn fall.

260
00:29:55,670 --> 00:30:04,760
Overall assessment, minus nought point one: it's a pawn down, but only minus nought point one, so it must think that it's got 0.9 of a pawn's worth of activity.

261
00:30:04,760 --> 00:30:13,310
So I think it must be expecting to get some activity to compensate. And what I want to claim is that conversation looks quite natural.

262
00:30:13,310 --> 00:30:18,350
All right. All of those words in red are psychological words.

263
00:30:18,350 --> 00:30:24,620
Nowadays, we do apply them to computer systems almost without a second thought.

264
00:30:24,620 --> 00:30:31,160
Certainly, I would be very surprised if someone were so pedantic as to say: well, of course,

265
00:30:31,160 --> 00:30:35,960
it's not literally thinking, it hasn't literally realised anything, it's not literally trying to do anything.

266
00:30:35,960 --> 00:30:42,950
And so the language seems very natural in a computational context these days.

267
00:30:42,950 --> 00:30:55,160
So our thinking about these things has been modified. What about the other prediction, that by the year 2000 the Turing test

268
00:30:55,160 --> 00:31:00,350
will at least have been passed to this rather modest extent:

269
00:31:00,350 --> 00:31:10,760
an average interrogator will not have more than a 70 percent chance of making the right identification after five minutes of questioning?

270
00:31:10,760 --> 00:31:16,910
Well, I suggest that that was actually eminently plausible in retrospect.

271
00:31:16,910 --> 00:31:18,530
I don't think it was actually achieved,

272
00:31:18,530 --> 00:31:25,340
but I think it could have been achieved if that had been what artificial intelligence researchers were trying to achieve.

273
00:31:25,340 --> 00:31:31,280
I think they'd have done it. But it's plausible for a bad reason.

274
00:31:31,280 --> 00:31:36,880
And those researchers were entirely right not to devote their efforts to achieving it.

275
00:31:36,880 --> 00:31:49,940
Let's see why. So I think this is one of the most unfortunate passages in Turing's entire paper because he gives the impression that a criterion

276
00:31:49,940 --> 00:32:00,260
for progress towards machine intelligence can be based on how well the programme can fool an average interrogator and for how long.

277
00:32:00,260 --> 00:32:07,400
And as a result, quite a lot of effort has gone in, not from serious A.I. researchers,

278
00:32:07,400 --> 00:32:15,200
but from others to try to pass the Turing test, according to that kind of criteria.

279
00:32:15,200 --> 00:32:21,470
But here's the problem. It turns out that fooling an average interrogator is relatively easy to achieve,

280
00:32:21,470 --> 00:32:29,180
but not by techniques that plausibly involve genuinely intelligent information processing.

281
00:32:29,180 --> 00:32:37,400
This, I suggest, is perhaps the fundamental problem with the Turing test.

282
00:32:37,400 --> 00:32:41,510
Here is a piece from the BBC website.

283
00:32:41,510 --> 00:32:47,870
I think it was in 2014. June 2014, the Turing test has finally been passed.

284
00:32:47,870 --> 00:32:55,040
Well, it's a nice bit of publicity for Reading University, but it's really not very plausible, as we shall see.

285
00:32:55,040 --> 00:33:00,070
But anyway, for the first time, we have Turing's criterion being met.

286
00:33:00,070 --> 00:33:09,860
OK, humans had no more than a 70 percent chance of identifying which was the human and which was the computer after five minutes of questioning.

287
00:33:09,860 --> 00:33:24,920
That's the claim to fame. What's going on here was revealed in 1966 by Joseph Weizenbaum. Weizenbaum basically set

288
00:33:24,920 --> 00:33:33,110
the pattern for the way in which these things have worked ever since. He published the ELIZA programme,

289
00:33:33,110 --> 00:33:41,270
together with a script showing how very simple text manipulation can generate a plausible conversation.

290
00:33:41,270 --> 00:33:50,330
Ingeniously, he had his chatbot playing the role of a Rogerian psychotherapist, echoing what the human says,

291
00:33:50,330 --> 00:33:57,140
expressing sympathy, asking gentle questions to elicit their feelings and so forth.

292
00:33:57,140 --> 00:34:04,520
And the computer responses are generated by making small changes to the human inputs, exchanging first and second person and so on.

293
00:34:04,520 --> 00:34:11,210
So here is some of the dialogue that he published.

294
00:34:11,210 --> 00:34:18,620
I'll explain later about the rather dramatic slides, but what I want to do is illustrate this.

295
00:34:18,620 --> 00:34:27,650
I'll be using some software that I wrote called Elizabeth. Elizabeth is a chatbot creation system.

296
00:34:27,650 --> 00:34:38,600
It allows you to look inside and see what's happening. So let's suppose I've gone to Elizabeth and I want a bit of counselling.

297
00:34:38,600 --> 00:34:48,680
Yes, like this, I hope, Will. Yeah. Sorry.

298
00:34:48,680 --> 00:35:05,500
My mum is sad. Tell me more about your family.

299
00:35:05,500 --> 00:35:20,510
OK. Hmm.

300
00:35:20,510 --> 00:35:26,420
There's clear understanding there. Isn't that pretty good?

301
00:35:26,420 --> 00:35:43,530
Oh, OK. Sorry.

302
00:35:43,530 --> 00:36:04,160
I can't see it on the screen in front of me, that's the problem. There we are.

303
00:36:04,160 --> 00:36:29,020
Yeah. Look at this, I'm very sceptical.

304
00:36:29,020 --> 00:36:45,610
I'm going to ask, can you think? All right.

305
00:36:45,610 --> 00:36:49,410
At that point, you will probably conclude this is all smoke and mirrors.

306
00:36:49,410 --> 00:36:56,310
It is all smoke and mirrors, and we can, I think, there we are, there.

307
00:36:56,310 --> 00:37:02,400
We can see what's going on within Elizabeth. You can actually see how the process is taking place.

308
00:37:02,400 --> 00:37:12,190
This script doesn't actually use any memory, but there's, oh, I didn't try that: just pressing return.

309
00:37:12,190 --> 00:37:17,430
If I just press return, the response comes back: "Can't you think of anything to say?"

310
00:37:17,430 --> 00:37:24,750
"Mum" gets translated into "mother", "dad" into "father". You can have as many input transformations as you like.

311
00:37:24,750 --> 00:37:27,660
Keyword transformations; and you can see, right at the bottom,

312
00:37:27,660 --> 00:37:32,790
if the word "stupid" or "idiot" occurs: "I don't think language like that is going to help."

313
00:37:32,790 --> 00:37:38,790
"I think [phrase]" gets "Why do you think [phrase]?" For the phrase "mother" or "father" you get:

314
00:37:38,790 --> 00:37:45,240
Tell me more about your family. Are you the youngest in your family? What do you remember most about your childhood?

315
00:37:45,240 --> 00:37:50,190
"[Phrase one] is younger than [phrase two]", so "[phrase two] is older than [phrase one]", and so on.

316
00:37:50,190 --> 00:37:55,230
I mean, basically the responses are just pre-packaged.

317
00:37:55,230 --> 00:38:02,130
And of course, I've put in some ones about thinking and souls at the end just for a bit of a laugh.

318
00:38:02,130 --> 00:38:14,550
And that is all run by this script. So basically, you just edit the script and off it goes.

319
00:38:14,550 --> 00:38:25,080
One thing you can do with this, if you want to play with it, you can. I hope this will work if I open a saved dialogue.

320
00:38:25,080 --> 00:38:36,980
There we are, over here, you can't see this: I'm loading the file with the Weizenbaum dialogue, and it looks right.

321
00:38:36,980 --> 00:38:48,090
There you go. And now this is actually running the very dialogue that Weizenbaum published; I've put it into Elizabeth, and you can, as I say,

322
00:38:48,090 --> 00:38:54,340
try this on the web. And if I press function key two, we can see his dialogue, so you have to take my word for it that

323
00:38:54,340 --> 00:39:04,110
this is being produced in exactly the same way. "Men are all alike." "In what way?" "They're always bugging us about something or other."

324
00:39:04,110 --> 00:39:09,390
Can you think of a specific example? Well, my boyfriend made me come here.

325
00:39:09,390 --> 00:39:14,610
Your boyfriend made you come here. He says, I'm depressed much of the time.

326
00:39:14,610 --> 00:39:18,330
I am sorry to hear you are depressed. It's true.

327
00:39:18,330 --> 00:39:24,690
I am unhappy. Do you think coming here will help you not to be unhappy?

328
00:39:24,690 --> 00:39:30,660
I need some help. That much seems certain. What would it mean to you if you got some help?

329
00:39:30,660 --> 00:39:35,980
Perhaps I could learn to get along with my mother. Tell me more about your family.

330
00:39:35,980 --> 00:39:41,100
You see where I got that one from? My mother takes care of me.

331
00:39:41,100 --> 00:39:47,270
Who else in your family takes care of you? My father.

332
00:39:47,270 --> 00:39:51,230
"Your father." It can't do anything else with it; just "my father",

333
00:39:51,230 --> 00:39:59,090
okay, "your father". "You are like my father in some ways."

334
00:39:59,090 --> 00:40:06,590
What resemblance do you see? You're not very aggressive, but I think you don't want me to notice that.

335
00:40:06,590 --> 00:40:13,190
What makes you think I am not very aggressive? You don't argue with me.

336
00:40:13,190 --> 00:40:18,620
Why do you think I don't argue with you? You are afraid of me.

337
00:40:18,620 --> 00:40:24,860
Does it please you to believe I am afraid of you? My father is afraid of everybody.

338
00:40:24,860 --> 00:40:31,700
What else comes to mind when you think of your father? Bullies.

339
00:40:31,700 --> 00:40:37,670
Does that have anything to do with the fact that your boyfriend made you come here? Again,

340
00:40:37,670 --> 00:40:46,850
it is all smoke and mirrors. Moreover, I can reveal that Weizenbaum cheated, because in implementing his script as published,

341
00:40:46,850 --> 00:40:53,490
I had to reverse the order of a couple of the lines in order to get it to work.

342
00:40:53,490 --> 00:41:05,240
Clearly, he was doing it in such a way that he ended up with that really nice coincidence where you get "bullies" and then the response,

343
00:41:05,240 --> 00:41:09,590
Does that have anything to do with the fact that your boyfriend made you come here?

344
00:41:09,590 --> 00:41:15,950
I can tell you, the fact that that lines up with "bullies" was pure coincidence.

345
00:41:15,950 --> 00:41:23,330
So here's how it works. If you have the word "alike" in the input, back comes the response

346
00:41:23,330 --> 00:41:29,000
"In what way?" For most sentences containing the word "alike",

347
00:41:29,000 --> 00:41:33,770
"In what way?" is an appropriate response. "Something or other":

348
00:41:33,770 --> 00:41:40,070
"Can you give an example?" "My X made me come here", "my boyfriend made me come here":

349
00:41:40,070 --> 00:41:47,240
"Your boyfriend made you come here." And by the way, remember the X; it might be useful later.

350
00:41:47,240 --> 00:41:51,440
OK: "I need wine". What if you got wine?

351
00:41:51,440 --> 00:41:58,460
"What would it mean to you if you got wine?" Then for "my mother", all those questions about your family:

352
00:41:58,460 --> 00:42:03,440
"My mother does such and such": "Who else in your family does such and such?"

353
00:42:03,440 --> 00:42:10,610
And then, if there's no matching pattern (what you saw in my little script there was "Tell me what you like doing"), here,

354
00:42:10,610 --> 00:42:18,260
One of the responses is, does that have anything to do with the fact that your ex remembered from earlier?

355
00:42:18,260 --> 00:42:24,410
OK? So it gives the impression of intelligence, but there isn't any.
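To make the mechanism explicit, here is a minimal sketch of this kind of keyword processing in Python. The patterns and canned responses are illustrative stand-ins based on the rules just described, not Weizenbaum's published script or Elizabeth's actual code.

```python
import random

# Keyword rules: if the keyword appears in the input, pick one of its canned
# responses; {rest} is filled with whatever followed the keyword.
RULES = [
    ("alike",        ["In what way?"]),
    ("i need",       ["What would it mean to you if you got {rest}?"]),
    ("my mother",    ["Tell me more about your family.",
                      "Who else in your family {rest}?"]),
    ("my boyfriend", ["Your boyfriend {rest}."]),
]
NO_MATCH = ["Can you think of a specific example?",
            "Does that have anything to do with the fact that {memory}?"]

memory = []   # fragments stashed away for later reuse

def reply(user_input):
    text = user_input.lower().strip(" .!?")
    for keyword, responses in RULES:
        if keyword in text:
            rest = text.split(keyword, 1)[1].strip()
            rest = rest.replace(" me ", " you ").replace("my ", "your ")  # crude person swap
            if keyword == "my boyfriend":
                memory.append("your boyfriend " + rest)   # remember for later
            return random.choice(responses).format(rest=rest)
    if memory:   # nothing matched: fall back, reusing remembered material
        return NO_MATCH[1].format(memory=memory.pop(0))
    return NO_MATCH[0]

print(reply("Well, my boyfriend made me come here"))  # Your boyfriend made you come here.
print(reply("I need some help"))                      # What would it mean to you if you got some help?
print(reply("Bullies"))                               # Does that have anything to do with the fact
                                                      # that your boyfriend made you come here?
```

A handful of string matches and substitutions is enough to reproduce the apparently insightful turns of the published dialogue.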

356
00:42:24,410 --> 00:42:30,020
OK, so let's be quite clear, chat bots are not intelligent.

357
00:42:30,020 --> 00:42:32,810
You can download my software, play around with it.

358
00:42:32,810 --> 00:42:38,780
You will very, very soon come to the conclusion that these are rather limited, even Weizenbaum's script, which is clever.

359
00:42:38,780 --> 00:42:42,260
I mean, it's extremely clever. It's got a lot of keywords in there.

360
00:42:42,260 --> 00:42:50,510
But you know, if you try to conduct any sort of sustained conversation with it, you'll find it extremely frustrating.

361
00:42:50,510 --> 00:42:57,620
They seem to confirm that Searle is right. You've got mere syntactic processing there.

362
00:42:57,620 --> 00:43:02,840
No semantics at all. Even the bit of them that looks most clever, right?

363
00:43:02,840 --> 00:43:11,960
The switch, you may have noticed and wondered about: the switch between "you" and "me".

364
00:43:11,960 --> 00:43:16,550
Sorry, where is the... Oh, that's interesting, isn't it?

365
00:43:16,550 --> 00:43:28,260
That's very interesting. Let me go back to

366
00:43:28,260 --> 00:43:34,470
the initial script that I showed you before, if I can get to it; sorry, I'm going the wrong way.

367
00:43:34,470 --> 00:43:40,400
There we are. Here we are.

368
00:43:40,400 --> 00:43:48,350
You can see the simple changes: from "I am" to "you are", "you are" to "I am", "I" to "you", "me" to "you", and so forth.

369
00:43:48,350 --> 00:43:52,970
This happens in a very, very straightforward way.

370
00:43:52,970 --> 00:43:58,040
So the way I make Elizabeth work is a bit different from Weizenbaum's.

371
00:43:58,040 --> 00:44:03,450
That's why I was saying it's interesting: Weizenbaum does everything through the keyword transformations.

372
00:44:03,450 --> 00:44:14,450
And it so happens that if you type something like "my sister is younger than me", it says "so you are older than your sister".

373
00:44:14,450 --> 00:44:21,140
The change between me and you and my and your and so on happens very, very straightforwardly.

374
00:44:21,140 --> 00:44:29,990
So even the part of the manipulation that gives the most impression of understanding is extremely crude.
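For what it's worth, here is roughly how crude that person-swapping is when written out: a word-by-word substitution and nothing more. This is my own minimal sketch, not Elizabeth's actual code.

```python
# Word-by-word first/second person swap: no parsing, no understanding.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are",
               "you": "i", "your": "my", "are": "am"}

def reflect(fragment):
    """'my sister is younger than me' -> 'your sister is younger than you'."""
    return " ".join(REFLECTIONS.get(word, word) for word in fragment.lower().split())
```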

375
00:44:29,990 --> 00:44:38,690
I mentioned the chatbot that was supposed to have passed the Turing test in 2014, or at least was claimed to have, called Eugene Goostman.

376
00:44:38,690 --> 00:44:46,940
It's a chatbot that claims to be, I think, a 13-year-old Ukrainian boy.

377
00:44:46,940 --> 00:44:52,160
It's a nice trick. Again, it's like having the Rogerian psychotherapist.

378
00:44:52,160 --> 00:44:55,820
The good thing about a Ukrainian boy is that if the grammar goes wrong,

379
00:44:55,820 --> 00:45:06,080
well, what do you expect? Not a native speaker, right? So, you know, you can see here:

380
00:45:06,080 --> 00:45:10,160
This is obtainable from the web. It's very much the same kind of thing.

381
00:45:10,160 --> 00:45:25,370
You've got simple patterns and responses. And this is part of a dialogue which a guy called Leonid Bershidsky tried with the chatbot.

382
00:45:25,370 --> 00:45:28,070
The chat bot claims to be from Ukraine.

383
00:45:28,070 --> 00:45:42,860
Well, in that case, you'd want to know about what had happened in Odessa on May the 2nd that year, and it clearly showed no sign of knowing about it at all.

384
00:45:42,860 --> 00:45:47,450
So basically, with chatbots, if you try to have any sort of sensible conversation with them,

385
00:45:47,450 --> 00:46:01,070
a sustained conversation rather than just taking their rather vague outputs at face value, it quickly falls apart.

386
00:46:01,070 --> 00:46:04,790
Now, I'm not saying chatbots are completely useless.

387
00:46:04,790 --> 00:46:12,560
They can actually be valuable precisely because we are so inclined to interpret their outputs as intelligent.

388
00:46:12,560 --> 00:46:16,310
Most people prefer to interact conversationally.

389
00:46:16,310 --> 00:46:21,290
And so you can have automated help systems and things like that that are based on chat bot technology.

390
00:46:21,290 --> 00:46:27,500
And often, I mean, imagine you have had an automated help system for tourists to Oxford.

391
00:46:27,500 --> 00:46:33,590
Most of the time, they will be asking questions that can be answered in a very straightforward way.

392
00:46:33,590 --> 00:46:41,930
So if you see botanic gardens say in the query, the chances are they want to know where the botanic gardens are and how to get there.

393
00:46:41,930 --> 00:46:47,840
So if you've got canned responses that give answers appropriately, that can be quite useful,

394
00:46:47,840 --> 00:46:55,280
and chat bot methods can succeed in eliciting information quite effectively.

395
00:46:55,280 --> 00:47:09,620
Just before you look at the next slide, I just want you to focus on the beautiful, low key design of that slide and compare it with this.

396
00:47:09,620 --> 00:47:18,590
So, my chatbot Elizabeth was picked up by a guy in a market research consultancy in Africa,

397
00:47:18,590 --> 00:47:28,340
and we did a pilot of a study whereby it was used to pick up information about people's mobile phone choices and why they change networks and so on.

398
00:47:28,340 --> 00:47:32,150
So we put questions in, in order to elicit that kind of thing.

399
00:47:32,150 --> 00:47:40,280
This is what the market research company did to my slide. So now you know why those earlier slides, with the ELIZA dialogue,

400
00:47:40,280 --> 00:47:50,810
looked rather dramatic. It was an interesting experience, and one of the funniest things for me was just seeing this.

401
00:47:50,810 --> 00:47:53,820
OK, so let's look at the Turing test.

402
00:47:53,820 --> 00:48:03,440
The problem with the Turing test, I've pointed out, is, well, we know that it's not a necessary condition for intelligence.

403
00:48:03,440 --> 00:48:14,000
We've seen that Turing acknowledged that you can have things that are intelligent that don't pass for humans.

404
00:48:14,000 --> 00:48:24,080
But unless we interpret it pretty rigorously, it's not sufficient. And that's ironically because of human lack of critical judgement.

405
00:48:24,080 --> 00:48:31,100
So it's our failure of intelligence that actually makes the Turing test an inappropriate test.

406
00:48:31,100 --> 00:48:37,400
And much of our conversation is sloppy and careless in normal life.

407
00:48:37,400 --> 00:48:41,990
You know, you go into a pub, listen to the conversation. A lot of it is very imprecise.

408
00:48:41,990 --> 00:48:51,840
That means that when we come across conversation, that is sloppy and imprecise, we regard it as having been output by an intelligent system.

409
00:48:51,840 --> 00:48:56,270
We read intelligence into it. We try to make sense of it.

410
00:48:56,270 --> 00:49:03,140
So if we have an interlocutor who comes out with the kind of vague responses that Eliza does,

411
00:49:03,140 --> 00:49:09,710
you know, vague, but somewhere in the right ballpark of relevance, we read more in.

412
00:49:09,710 --> 00:49:17,300
We read them as being caused by something that's genuinely intelligent where there is genuine meaning behind it.

413
00:49:17,300 --> 00:49:22,430
And in fact, there isn't. So it's a great shame,

414
00:49:22,430 --> 00:49:30,980
I think, that Turing gave the impression that better performance in his test, which means fooling us for longer, is actually a criterion of intelligence.

415
00:49:30,980 --> 00:49:41,560
It just isn't. So it's a very implausible test if you interpret it in that way.

416
00:49:41,560 --> 00:49:47,950
We've seen it's more plausible as a sufficient condition for intelligence when interpreted more stringently.

417
00:49:47,950 --> 00:49:56,230
So if something produced conversation of the level of Turing's sonnet conversation across a wide range of fields,

418
00:49:56,230 --> 00:50:01,120
that might be a reasonable criterion for intelligence,

419
00:50:01,120 --> 00:50:06,160
at least in the sense of a sufficient criterion, you would think, yes, this is genuinely intelligent.

420
00:50:06,160 --> 00:50:12,730
if it can match that without having its responses, you know, canned and so forth.

421
00:50:12,730 --> 00:50:24,460
But then it seems inappropriately demanding, because if you end up with a conversation like that, the more intelligence it reveals,

422
00:50:24,460 --> 00:50:32,020
the more material there is which may push us away from seeing it as human.

423
00:50:32,020 --> 00:50:41,860
And there's another problem as well. This is brought out in a paper by Robert French published in Mind in 1990.

424
00:50:41,860 --> 00:50:52,450
You might think this is overkill, but it's fairly easy to devise questions which elicit our cultural understanding of things,

425
00:50:52,450 --> 00:50:57,490
and it's hard to imagine that a computer could be programmed with any ease to do this.

426
00:50:57,490 --> 00:51:05,650
So, for example: rate "Flugly" as the name of a glamorous model or of a cuddly toy.

427
00:51:05,650 --> 00:51:14,080
Well, any native English speaker will see that "Flugly" is a rather bad choice if you're a glamorous movie star, right?

428
00:51:14,080 --> 00:51:18,050
But a cuddly toy? Yeah, that works quite well,

429
00:51:18,050 --> 00:51:23,950
if you can imagine a baby having a "Flugly" that's very dear to it.

430
00:51:23,950 --> 00:51:31,160
And questions like this, rating things, draw on a lot of our cultural understanding,

431
00:51:31,160 --> 00:51:38,410
and it would be extremely difficult to programme a computer to be indistinguishable from a human in all those respects.

432
00:51:38,410 --> 00:51:43,420
But that really seems a bit irrelevant, doesn't it?

433
00:51:43,420 --> 00:51:54,160
If you're testing for a computer being intelligent, these sorts of trick questions, we feel, shouldn't really be the main point.

434
00:51:54,160 --> 00:52:00,520
Okay, so the Turing test is a failure. But can we do better?

435
00:52:00,520 --> 00:52:08,410
Well, I think we can, actually. I'm just adding a couple of letters there and changing it to the tutoring test.

436
00:52:08,410 --> 00:52:16,540
So the big problem with the Turing test, OK, is that to get plausible performance in a Turing test,

437
00:52:16,540 --> 00:52:20,140
at least, you know, five minutes of questioning, average interrogator and so on.

438
00:52:20,140 --> 00:52:25,840
The best way to go is deceit. You pretend to be something you're not.

439
00:52:25,840 --> 00:52:32,920
Now, it's not at all desirable to have a test of intelligence which depends on hiding things.

440
00:52:32,920 --> 00:52:39,310
Rather, we want a test that actually involves revealing things, revealing understanding.

441
00:52:39,310 --> 00:52:50,500
So suppose we imagine a system that's designed to tutor, maybe in a limited domain, well, almost certainly in a limited domain: take chemistry.

442
00:52:50,500 --> 00:52:55,720
So chemistry is a very sophisticated subject.

443
00:52:55,720 --> 00:53:05,620
It's complicated, difficult; there are lots of things that link together. In order to learn chemistry at any sort of deep level,

444
00:53:05,620 --> 00:53:09,940
you need to understand complex informational structures and how they fit together.

445
00:53:09,940 --> 00:53:18,850
OK, things to do with, you know, quantum shells and how different elements and different molecules interact.

446
00:53:18,850 --> 00:53:24,550
Now, imagine you had a system which was designed to tutor chemistry.

447
00:53:24,550 --> 00:53:30,190
It's not going to pretend to be human. It's not going to try to be indistinguishable or anything like that.

448
00:53:30,190 --> 00:53:38,740
The test is: how well can it tutor? So can someone who doesn't understand the various complexities of chemistry

449
00:53:38,740 --> 00:53:45,490
go and interact with the system and then, at the end of an hour, understand something?

450
00:53:45,490 --> 00:53:51,010
The person understands something which he or she didn't before and maybe understands

451
00:53:51,010 --> 00:53:56,800
it just as well as if they had been tutored by a competent human tutor.

452
00:53:56,800 --> 00:54:04,480
So there's something in the spirit of the Turing test, right? The test is: can it achieve the competence that a human could?

453
00:54:04,480 --> 00:54:13,870
But we've now got something which works by eliciting real understanding in a human and therefore the more

454
00:54:13,870 --> 00:54:19,870
it has built into it in terms of complex informational structures linking together in appropriate ways,

455
00:54:19,870 --> 00:54:26,980
the better it will do. It's revealing understanding rather than pretending.

456
00:54:26,980 --> 00:54:33,350
And it has the considerable virtue also potentially of producing seriously useful products.

457
00:54:33,350 --> 00:54:38,120
I put this to Hugh Loebner, who organises the Loebner Prize,

458
00:54:38,120 --> 00:54:43,220
suggesting that he might wish to reconfigure it in the future, but I'm afraid he's not interested.

459
00:54:43,220 --> 00:54:48,650
There we go. I tried. And that's it for today. See you for the last lecture next time.

460
00:54:48,650 --> 00:54:51,614
Thank you.
