Identity Copulae (cont., `:' and others)

oe: Welcome Jan Hajič.

oe: We spent yesterday afternoon discussing the copula, but not the examples we had proposed based on what we think of as identity copulae. We had reached something of a consensus that in most uses the ones we were discussing have something predicative going on. I propose that we talk about copulae again this morning, and then one more topic before the end of the day.

Alex: [agrees to chair] We should start with a quick summary of where we were yesterday. I'd like to see for example Browne is a manager and The theory is that Browne arrived in the candidate representations.

Emily: That could take some time.

Alex: Or maybe I should just remind everyone. With AMR, because of the principle that each node should denote one thing, there's essentially one node. For the Prague analysis, again you have roles for these different parts. That was quite common for these analyses --- XFR and ERS. ERS has several be rels, essentially capturing the fact that either side of the predication can play different roles in the equality and be of different types (clausal or individual). I can't remember with the Prague analysis whether the roles are different?

Emily/Silvie: Yes.

oe: And in some cases it was identity, in other cases it was intersection.

Alex: And we decided it was a subset relation in the model theory. So time to move on, and let's look at AMR first.

Tim: Close to the analysis for The reason is that Browne arrived. Ideally you'd subdivide reason into causation and justification, but we always do causation for now.

Dan: Left ARG0 blank for cases like The reason that we left is that Browne arrived.

Woodley (aside): See, reason has two arguments!
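
[Scribe's note: roughly the PENMAN shape Tim and Dan describe for The reason is that Browne arrived. --- a reconstruction, not the released annotation; which numbered role holds the arriving and which is left blank is my guess:

    (c / cause-01                 ; "reason" collapsed to causation, per Tim
       :ARG1 (a / arrive-01
                :ARG1 (p / person :name (n / name :op1 "Browne"))))]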

Alex: AMR has the marvelous luxury of not having to construct this from the grammar. I'd like to see a grammar analysis, so let's go to the ERG. Can you do this one?

Dan: I do something.

Emily: This is one of the ones we repaired.

oe: We haven't yet quantified how many of ours come from the grammar, and how many require manual patching. There were 7 of those, and this is one of them. The only difference between this and the ERS for The reason is that Browne arrived. is in the name of the predicate symbol.

Woodley: Should this be a qeq?

Dan/Alex: Yes.

Alex: At the discourse level, the reason could transcend a sentence boundary. The reason: gives an explanation relation at the discourse level, and that second argument could go over the boundary. That doesn't matter for these purposes, but you would want quantifiers to be able to tuck in under that.

Dan: When I fix the grammar, I'll put that in, since not having the qeq is the marked case.
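
[Scribe's note: the qeq in question, in schematic MRS notation --- handle names and predicate spellings are invented for illustration:

    h1: _reason_n_1(x2)            ; the reason ...
    h7: _arrive_v_1(e8, x9)        ; ... Browne arrived
    the clausal argument of the reason relation = h6,  with  h6 qeq h7
    ; a qeq rather than h6 = h7 outright, so quantifiers can tuck in under it]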

oe: You might even say it's a borderline case with respect to sentence segmentation. This analysis would be compatible with that.

Dan: Your reason? Browne left.

oe: Not very creative; I put something in that should be interpreted in the identity/intersection space.

Ann: Not necessarily for a colon. Point one: Browne arrived.

Ann: What's the generalization about what colons do? If NP, take the index and if S take the LTOP.

Dan: To summarize: Browne should arrive.

Alex: What do you do there --- nominalization on the left?

Emily: Why is Point one: Browne arrived. an example of why it's not identity?

Ann: If you want to be loose enough about what you mean by identity.

oe: It's not like identity in the model theory.

Ann: I don't mind it being loose identity, and it has to be for examples like A team is 15 players. It's not identity in a very tight way between two individuals. There's some sort of identity there.

oe: That's kind of definitional, right?

Emily: It's definitional in the way that whales are mammals is.

Alex K: It's definitional in a different way. whales are mammals is an empirical fact, a team is 15 players is contingent.

Emily: It's still definitions.

Ann: The definitional property is only a side fact. There are other examples that don't have the property.

Dan: Our team is 10 players.

Woodley: Another example: Chapter 15: Browne arrived. is much more opaque.

Ann: That's what I was trying for. That's a much sharper example. Can still say it's identity in some metalinguistic sense.

Silvie: We treat it as an apposition of one nominal and one predicate phrase, and we have something to hang these two things on, which is the colon, so we make the colon something like a conjunction.

Alex: Here it's relating...

Silvie: Normally it would be actor and patient, because there would be a predicate above that, but the conjunction itself cannot be the predicate because it's only a conjunction, and the functor marks the type. This can be either a predicate when it's a verb, or DENOM when it's an NP. We'd rather have two predicates or two NPs; here we have two different things, but it's still doable.

Dan: If this were The reason: my headache...

Silvie: That would be two DENOMs.

oe: This is what you call the paratactic structure?

Jan: If this were under the verb, there would be two patients.

Dan: He had a good reason: my headache.

Jan: Yes.

Silvie/Jan: The functors (instead of DENOM, PRED) would be whatever they should be given the structure of the longer sentence.

Jan: DENOM, PRED label empty dependencies; "this is the root". Because we have a few of them, we can distinguish between true predicates and others.

Dan: You'd end up with a very similar analysis if the segmenter had broken this into two separate clauses, different from what we saw in AMR and ERS. You're just saying they're side by side.

Jan: Right.

oe: The circle indicates that this is a paratactic structure, and the roles here aren't relevant to the immediately dominating node, but to the larger context.
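
[Scribe's note: the paratactic structure as I understood it, for The reason: Browne arrived. --- tree layout and the root label are mine; the functors are from the discussion:

    #Colon  (paratactic root, conjunction-like)
     |-- reason [DENOM]
     '-- arrive [PRED]
          '-- Browne [ACT]]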

Alex: Will we see more like this with coordination?

Silvie: The difference is whether these are the roots of the sentence.

oe: [Brings up coordination example: The dog, the cat or the picture arrived.] We see the same group formation with three members, all assigned the ACT role relative to arrive.

Alex: Let's move on then to XFR and then we can do Johan.

Dick: I didn't check over this one. I think it's very unclear how to interpret a colon. I think there are lots of places where it doesn't correspond to an identity copula, and I don't think there's any way for the semantics, without a lot of other processing, to figure out which ones do. So here it's treated as coordination.

Alex: So that could work for something like John's cat: Mary spoils her. where the copula sense wouldn't work.

Dick: Or Let me tell you: ...

Alex: So you're carving up the different senses of the colon differently from the ERG. The Prague analysis and your analysis, because you're calling it conjunction, are leaving it up to pragmatics, whereas the ERG is making a decision in the grammar.

Dick: Normally with conjunction you can have several conjuncts, but not with a colon. It may be more sensible to call it a colon relation.

Emily: And leave it underspecified?

Dick: I'm afraid so.

Silvie: We put semantic labels/functors on the coordination relations: disjunction, conjunction, contrast. Some semantic analysis, but just a little.

Dan: I'm interested in why you think that our one is more contentful. Why do you think there's identity there?

oe: I suggested the parallel.

Dan: I think we were trying to be just about as neutral as Dick was describing.

Dick: But what I'm doing is too specific.

Ann: If we were going to have relationships between sentences (and not restrict our sentence-level processing), this would be like the relationship between two sentences. Is everyone happy with that?

Woodley/Emily: But the NP fragment would get more wrapper in that case.

Ann: Yes. I was just talking about the relationship between the two, like Dick's coordination could be a reasonable relation.

Johan: From my corpus: The final score: [multi-sentence] Seems like a topic relation. But here, it just means the same as The reason is that Browne arrived.

Alex: So this wouldn't work as an analysis of John's cat: Mary spoils her.

Johan: I agree. That would need a different analysis. The one with the score, I don't know what you'd do with that for example.

Dan: There's a close analogue: We had a good reason: Browne arrived. The reason could be part of a prior clause.

Alex: A reason: Browne arrived. There's still an anaphoric dependency to prior stuff. You have to bind that reason, and once you've done that, the colon tells you that the thing to the right of the colon is what that bound reason is. I think Johan's analysis helps there, because he's got different things going on because of the way the content of reason gets incorporated into the discourse context (to the left of the +).

Jan: When we discussed colons, it was in a slightly different context, because most of the ones in our corpus are from direct/indirect speech. We concluded that this is a shorthand --- shorthand means lots of ellipsis around it. Sometimes what's missing is clear. Sometimes less so: He entered the room: [quoted speech] I think this sort of fits here as well: I'm saying what the reason for what came before is. Too much of an ellipsis to resolve it in the annotation. We identified that we need three things: (i) ignore the colon (graphical separation, like quotes), so the thing after the colon is the argument as if the colon weren't there; (ii) make it a conjunction, because the verb is different, with something like and said understood (as in he slammed the door: `I hate you'); and (iii) what we did with the reason, where the ellipsis goes so far that interpreting it as saying is wrong. If we looked closely, we could probably get more individual types, most of them debatable.

Alex: In SDRT we've analyzed the colon as more specific than a full stop in what it contributes to the discourse, in two ways: (i) the two things separated by the colon have to be parts of segments in a coherence relation together, and (ii) some possible coherence relations are ruled out: but, or, moving the narrative on. It could be background and certain other relations. A. B. could be anything.

Dan: You're saying that as if you think there's just one colon in the language. I'm very much with Jan that there are several kinds.

Alex: There are several kinds, but all of them have that in common.

Johan: I think you're saying that they have different coherence relations, drawn from that set.

Alex: Moving on to The plan is to sleep more.

Johan: Not different from the other examples. There's some agent who wants to sleep more (we don't know who), and the presupposed plan is identified with the sleeping proposition.

Alex: But you're not binding the planner to the one who wants to sleep.

Dick: We've got some null pronoun doing the planning, which is the same as the one doing the sleeping.

Alex: But it's lexical semantics that's doing that, right?

Dick: Yes.

oe: So paraphrasable as Someone plans to sleep more.

Dick: Yes. That's the kind of transference of the control from the verb to the nominalization.

Woodley: Neither of you have the info that the more is more than something and that something is a presupposed entity?

Dick: Right. There's an elliptical comparative there.

Emily: What do you do with more?

Dick: Nothing very interesting.

Dan: We're not talking about more now.

Alex: Do you have two different plans, one for plan for and one for plan that?

Dick: Quite possibly.

Alex: And do they both derive their controlling crap from the verb?

Dan: Technical term.

Dick: Probably does, yes. Well actually, a lot of that is going to depend on corresponding subcat frames for verbs. It may well be that the complement on the nominal will give you the different subcat frame.

Dan: Oh, plan for NP not plan for X to Y. So there's a third case with the infinitival.

Dick: Certainly no subject control.

Dan: You're saying that what's coming after the copula is filling the complement of plan, right?

oe: The sleeping is the topic argument of the planning.

Dick: I don't know how it managed to do that. It's clever that it did...

Dan: I was struck by that too. Notice what I think is the contrast here: The reason that we left is that John arrived. but I don't think I can get The plan to stay awake is to sleep more.

Emily: The plan for staying awake is to sleep more.

Dan: You've done a courageous thing here which is probably right for plan, but it wouldn't work for The reason to buy books is to get smarter.

Dick: Courageous or fool-hardy.

Alex: *The plan that I get more energy is to sleep more.

Dan: Give me any sentence with that subject NP that would work well.

Alex: The plan that we would have a kebab lunch failed.

Dan: With is

Alex: The plan that I get more energy is a good one.

Dan: Thank you. There's something about the sorts of the arguments that is mysterious. That you can do this magic that Dick did and that it seems right here (but not in other places).

Alex: Now is a good time to see Tim's analysis.

Tim: Unsurprisingly it's the same thing.

Dick: But you kinda know how you did it.

oe: But he has access to competent speakers of the language.

Alex: But actually we're debating whether you want to do it in the grammar.

oe: So there's an act of planning to sleep more.

Tim: If in building up the AMR you ended up with the thing planned being equal to sleeping more, we'd merge them, since we unify identical things.

Dan: The plan isn't to sleep more, it's to drink more. Where does the negation go?

Tim: It would be on plan.

Dan: So without that it means there is a plan to sleep more?

Tim: It means we're not planning to sleep more.

Dan: Without the polarity minus, is that synonymous with There is a plan to sleep more?
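
[Scribe's note: where Tim says the negation lands for The plan isn't to sleep more. --- a PENMAN guess, with the elided sleeper simply absent and more set aside:

    (p / plan-01
       :polarity -                ; "we're not planning to sleep more"
       :ARG1 (s / sleep-01))]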

oe: But you don't want to control the agents?

Tim: We can't get control on elided agents.

oe: I think you did on that other example.

Tim: On completely omitted agents. If it were Abrams's plan is to sleep more then we could.

Alex: And what about the imperative: Plan to sleep more!

Emily: I think a better example is Make a plan to sleep more.

Tim: Yeah, you get the injected you, and so all the control pops up.

oe: It's not that you deny the control, you just can't show it because you can't hallucinate things.

Tim: Right --- if we could, we wouldn't know when to stop.

Dan: Do you really think there's control between the possessor of the plan and the sleeping?

Alex: Not for the possessor of the plan, for the planner.

Dan: My mom's plan is to sleep more.

Dick: I think you can get e.g. for the whole family to sleep more but it requires a lot of context.

Dan: So I don't want to do it the grammar.

Woodley/Alex: Remember, his control isn't grammar control.

Tim: For something like that, you ultimately have to go to discourse pragmatics. I don't think it's purely syntactic.

Alex: But for Dick, it is a matter of grammar plus lexical semantics.

oe: Aren't we going back to something we agreed to disagree on on the first day?

Alex: Moving on to Prague.

Silvie: We have to have an actor of sleep, and have to decide if it's a generic actor or something else, but we know it's the same one as the benefactor of the plan, not the planner. It was planned for them to sleep more.

Emily: It seems like there are two readings to It was planned for them to sleep more. depending on the for as complementizer or benefactive marker.

oe: Where does the BEN node come from? Is it from a lexical frame?

Silvie: As an annotator, I felt that there was a common participant between the planning and sleeping.

Jan: The guidelines say that when you identify control, you have to put in the controlling node.

Silvie: It's not as obvious as He's planning to control.

Jan: The guidelines for control are that if you want to use the #cor node, then you have to include some node.

Silvie: The decision is whether I use generic or #cor here, and since I used #Cor, I need a controller.

Alex: I assume you wouldn't put the control in for The plan for my graduate class is to lecture on Thursdays.

Silvie: There the class is being taught on Thursday? If I want to be very specific about what's happening to whom---I would probably skip it as an annotator, because it's difficult and low-frequency---but if I wanted to:

Alex: Let me put in a pronoun The plan for my graduate class is to lecture them on Thursdays.

Silvie: Then the patient would be a #cor and the benefactor would be there as well. I would make the same structure, only the #cor node would be the PAT not the ACT.

Alex: But I put in a pronoun, so it's not control.

Silvie: Sorry, then them would be coreferential (textual coreference) with class.

Alex: So back to The plan for my graduate class is to lecture on Thursdays.

Silvie: Then that's the benefactor with #Cor.

Woodley: It's the actor of lecture that's grammaticized.

Silvie: You're right---the arrow would have a different color with a generated personal pronoun.

Woodley: But there would still be a #cor node for the ACT of lecture.

Silvie: Right.
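
[Scribe's note: the structure Silvie describes for The plan is to sleep more., as I reconstruct it --- the label of the generated benefactor node is a guess:

    be [PRED]
     |-- plan [ACT]
     |    '-- #Benef [BEN]        ; generated benefactor of the plan
     '-- sleep [PAT]
          '-- #Cor [ACT]          ; grammatical coreference to the BEN node]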

Alex: What's missing? ... ERS. So this is like The theory is that I arrived.

Dan: And because of my religious belief, the sleeper is unbound.

Johan: He is to visit Boston.

Dan/Emily: That's a different be.

Johan: How can you tell?

Dan: There's a different meaning there. And there he's visiting Boston, and there's this purposive addition.

Johan: That's interesting. I don't make a distinction, but maybe I should.

Dan: And in fact, this is in our patched examples, because in the grammar I don't make the distinction, and only get the purposive one.

Alex: You also want the qeq there.

Emily: We didn't?

Dan: When we patched them, we didn't put them in.

Johan: Maybe this is an animate/inanimate thing.

Dan: This book is to go on the shelf.

Alex: So now what? I'm sick of the copula.

Emily: Let's do do be do be do.

Dick: XXX: Hopeless analysis.

Johan: So many boxes there, there's probably something wrong.

oe: And two new operators we haven't seen in this discussion.

Johan: x1 is Browne. If there is an event in which Browne could be the agent, then there is an arriving event. I made this up. The parser couldn't do it. (That's why Browne isn't in the presupposed box.)

oe: And e1 = e2?

Johan: That's the copula.

Alex: So if there's anything that Browne can do it's arriving.

oe: We did say, let's admire...

Dan: I think that looks pretty nice. And you did have the arriver as Browne?

Alex K: If they're identical events.

Dan: The one thing Browne couldn't do was arrive. I thought you were getting a free ride on Browne being the agent in both events because of the identity of events.

Alex K: He also says it explicitly.

Johan: Dan has a different point---I think the event would be in the negation so it would not be accessible.

Woodley: With the existential, you're not going to get the implication.

Johan: It would be the equivalency between the thing and the event.

oe: [edits ascii art]

Johan: [shudders and then] Right --- I don't have anywhere to hang the tense.
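
[Scribe's reconstruction of Johan's boxes for All Browne could do was arrive., flattened to one line per box --- approximate:

    [ x1 : named(x1, Browne) ]
    [ e1 : do(e1), agent(e1, x1), possible(e1) ]  ==>  [ e2 : arrive(e2), e1 = e2 ]
    ; the copula contributes e1 = e2; as just noted, tense has nowhere to hang]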

Dan: Then all Browne could do was not sing.

Alex: That's like a donkey sentence, so the consequent stuff can pick up from the antecedent.

Dan: I was out of money, so all I could do was not buy a train ticket.

Ann: I'm happy to call that grammatical, but I think it's pragmatically hopeless. I don't think it means what you want it to mean.

Woodley: Or anything.

Dan: My only option was to not sing... All I could do was not sing.

Johan: Okay, we'll do that next year.

oe: Okay, we'll have a pre-meeting karaoke.

Johan: And then we'll leave without...

Woodley: singing!

Silvie: The interpretation is there's a copula predicate and a relative clause and the relative clause has a grammatical control between all and the PAT of do, and this entire thing is what we call his arriving.

Dan: Who's doing the arriving?

Silvie: Sorry, Browne is missing from arrive.

Dan: Good.

oe: So that's another grammatically controlled one.

Silvie: There would be a generated node and then ... is that grammaticized?

Jan: It could be the act of that verb, it could be the patient. It would be just textual coreference.

Dan: It's actually grammaticized. You have no other choices.

Silvie: Isn't that a grammar template that's already too complicated?

Jan: All Browne could do was be killed.

Dan: It's not the thematic role, it's the subject. When you get around to putting it together, you don't have pragmatic control. Whatever the available argument is of that VP inside the complement of be, the grammar says it has to be the same as...

Silvie: *All Browne could do was Jane hasn't arrived.

Jan: Well there's the verb do down there.

Dan: You're right, it's gotta be do.

Emily: I think Silvie put her finger right on it when she said that's a really complicated grammar template. That's a really complicated grammar template!

oe: Usually Dan is more conservative about accepting control, but here he's embracing it.

Jan: We can make a clear representation with all the cases, including negation.

Silvie: All Browne could see was that Mary arrived.

Emily/Dan: That's not the same construction.

Tim: This is another magic AMR predicate: include-91 is our set/subset thing. The set of things Browne can do is the ARG2, arriving is the ARG1, and the particular relation between them (ARG3) is that those sets are coextensive.

oe: You say you never annotate the copula.

Tim: We always turn it into some predicate.

oe: You'd have to say this predicate emerges from the copula.

Tim: It's complicated to say something comes from the copula. Here include maybe comes from all.

Dan: What about The only thing he could do was arrive. Where does include come from there?

Emily: He doesn't have to say it comes from anywhere.

oe: So include-91 entails an ARG3 that shows the relationship?

Tim: Only when we need it. include-91 says that ARG1 is a subset of ARG2.
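
[Scribe's note: the include-91 shape Tim describes, in PENMAN guesswork --- node choices are illustrative, and the modal could is left out:

    (i / include-91
       :ARG1 (a / arrive-01
                :ARG1 (p / person :name (n / name :op1 "Browne")))
       :ARG2 (t / thing            ; the set of things Browne could do
                :ARG1-of (d / do-02 :ARG0 p))
       :ARG3 ...)                  ; only when needed, e.g. to mark the
                                   ; sets as coextensive ("all")]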

Alex K: Do you want it the other way around? The things he can do are included in the arriving things.

Emily: It's not all arriving situations, it's Browne arriving situations, if that helps.

Alex: If it's the one thing, what does one modify?

Tim: You'd have to put it on the ARG3.

oe: Is this parallel to a generalized quantifier---a relation between two sets?

oe: What about :ARG3 (c / cardinality :op1 "1")

Tim: We'd probably just say literally one, but that's cleaner.

Jan: Don't you just need one other bit for include---proper or not proper subset?

Alex K: You want a to refer to the set of all arriving events with Browne as the agent, right? But normally, you're just using that to assert the existence of just one event. I find it really hard to... AMR is very cautious, resisting all attempts to put a model theory in there, so maybe this isn't the right level of discussion, but I feel that the same subgraph (a / arrive) you'd take differently in another context, and in this case you want something like lambda event, events in which Browne arrived. If you were to try to assign a truth-conditional interpretation, you'd have to be very careful about how the same term refers in one place to one event and in another to a set of events.

Alex: Unpacking the truth conditions of this would be non-compositional.

Alex: :ARG2-of is subcategorized by thing.

Dan: This is a construction on our analysis that requires the do in the relative clause (as Jan noted), and the infinitival (or even bare VP) complement of be, and so this is yet another copula.

Alex: What is it that does the control here?

Dan: It's fancy.

Emily/Woodley: Is it algebra-compliant?

Dan: Yes, that was the triumph. This was a slightly unfair example, as it's one I spent 6 months working on.

=== High-level discussion ===

Silvie: Everyone has had enough of copula? Good.

Johan: Can I ask one more question? (To Dan, because he has so many different ones.) Do you have a different copula for The government is to blame.

Dan: I don't think I know what that construction is. A fixed expression?

Johan: I've got more of those. This book is to be discussed.

Dan: That's the same as The boy is to go to the store. discussed is passive. It's the blame one that's odd, and it's odd because it's idiosyncratic.

Woodley: This food is to eat.?

Silvie: Shall we take coordination, or something even more interesting? Someone said yesterday there wasn't much to talk about with that one.

oe: That may have been a rare instance of sarcasm.

Alex K: Oh, another one!

Dan: Maybe do a bit more strategizing, goal-setting. (From discussion of the coffee break.)

oe: Now? Okay.

Silvie: Then you get to moderate.

Dan: And you can go later.

oe: So move the wrap up earlier, or give it more time?

Dan: Move it earlier.

Woodley/oe: And it will take a lot of time.

Alex K: It could be nice to change pace. I'm phenomena-ed out.

Ann: Me too.

Johan: I can handle more phenomena, but if we'd like to choose, I'd like to do comparatives.

oe: At least pick our last phenomenon, before we allow ourselves the indulgence of the high-level discussion.

all: Majority is for comparatives.

Dan: Summarizing the discussion of the coffee break: Let's look a little bit at what outcome we might wish for from this meeting, beyond a better sense of clarity about the details of the various approaches. Johan suggested pursuing the building of an enriched corpus, with real-world examples illustrating the phenomena we're interested in and that we'd like to have our engines/annotation schemes tested against, and then talking about designing the characteristics or parameters of that corpus. So one direction is to talk about goals (like that). An alternate way to spend the next hour would be to look at how we could better serve existing or imagined consumers of our work. Where we fit into a pipeline, what other resources people have, what they'd like to have from this black box that goes from English to brackets, colons, boxes, or lines. If it's too poor or too rich we don't get uptake. Where's the happy balance? Dick and Johan are working hard on that, ERS is somewhat lazy, AMR is aggressively looking at the other side of the bridge before building it. Prague might also want a larger user community. So we could go the science/evaluation route or the sales/marketing route.

Johan: Let's do both, starting with the first one.

oe: I also wanted to ask for feedback on the idea of systematizing along various dimensions things we understand better at the end of this meeting.

Ann: I think that would be helpful for the having-consumers exercise.

Alex: And actually the first relates to the second. If you successfully market, it'll give you a way of doing extrinsic evaluation.

Alex K: It also goes in the other direction. If we had a 5000-sentence corpus of cool semantic phenomena in a real-world environment, with analyses in all of our frameworks, it would allow people to look at them and compare.

Emily: With, as front-page documentation, this matrix Stephan is talking about. The point of the matrix would be to understand the dimensions of variation and help people see what they're looking at.

Alex K: Many of the distinctions we've seen are quite subtle, and I'm not sure we can summarize them in a word or two.

Ann: Better would be to link to the 10 crucial examples that show the difference.

Dick: A matrix will be too detailed to be a good marketing tool.

Dan: Let's talk about marketing, with the corpus annotation as a subgoal.

Johan: If I were to do this again, I would ask for QA pairs or RTE pairs and then explain how you get those.

Dick: Agreed.

Alex: For the next one, not this one.

Johan: For marketing, you have to show that it works.

Dan: You've just hinted at a couple of tasks where you think that could be convincing.

Emily: Just showing that we can do something isn't enough.

Ann: But if we can't get it to work, then why should anyone else think they can? I think the concentration on inference is a good idea. Here's an inference we think we can make, what other resources do we need, what assumptions do we have to make about the domain... That would be a great topic for another meeting.

Alex: And I think this is the best outcome one can hope for with some of these, because your logical forms are descriptions of forms in some other object language but you haven't said what that is.

Ann: I don't even like the object language anymore. I'm an inferentialist now, I'm happy to play with MRSs.

oe: Being an inferentialist means working at the level of representations that support inferences and tossing out what Johan holds in high regard.

Johan: No! I'm exactly with Ann.

oe: But you seem to like mapping into a logic and using that to do the inference.

Johan: You need logic to do inference.

oe: On the other end of that continuum, I view AMR and FGD as writing representations of meaning/linguistic meaning and explicitly not connecting to logic. To annotate, for example, you have to ignore scope.

Tim: And getting clear how much of an issue that is for things like inference is really important.

Dick: From a marketing point of view, just saying we're doing linguistic meaning won't appeal, because there's a lot you need after that to do inference. It's useful to see what the linguistic meaning is to get a better sense of what the gap is, but you still have to fill in the gap.

Alex: Or tell them how to. You have to lay down your vision of what the grammar gives you and what the final meaning is.

Ann: It does mean redefining some of the standard tasks---e.g. our coreference and OntoNotes coreference aren't the same. Standard coref systems don't work over MRS. Actually being able to say here's how we're talking about this task ... and here's lots of data you could use to do this. Let's try doing this with our representation instead of the textual representation.

Alex: One example where that comes up is that standard coref tasks don't take account of abstract anaphora---though some of these formalisms are set up to do that, like Johan's. With a little work you could set up MRS. And you could too, Dick, with your context labels. If you were interested in finding abstract anaphora in a corpus and finding their antecedents (and I'd love to have a system that would do that for me), we've got to tell them how to bridge that gap. What is it that you would hook up to in that context-resolved logical form?

Alex K: You're saying that if we want this to be useful for users who are not us, there are certain tools needed beyond just computing semantic representations from the grammar. Maybe we can do better because we have these more detailed semantic representations, but there are certainly resources that we do need to support tasks. That raises the question of where these come from. Is it technically and politically (grant-acquiring) feasible to identify those things that we would need and get them built? Are those extra resources sufficiently similar to be shared across these frameworks?

Woodley: One way to figure that out would be to do what Johan was suggesting about what resources we would need to handle specific inferences.

Alex: If we were doing what Johan suggests, there's some people missing from the group.

oe: Thinking of the textual entailment community?

Alex: Less that than an expert in distributional semantics, someone from WordNet or FrameNet.

Ann: Let's not go too far... We would need to decide what type of inference we're talking about. We probably have enough knowledge about WordNet to know if it would work. We don't need experts on all of the different phenomena.

Johan: I agree. We don't need expertise from outside for the other phenomena.

oe: I interpret the FraCaS testsuite as examples about logical inference. To me there is a continuum from the FraCaS to this year's SemEval Task 1 (textual entailment, but doesn't require background knowledge)...

Dick: I think it's the background knowledge that's the crucial distinction. Logical inference is one way to do textual entailment. FraCaS was set up to minimize ambiguity and the amount of world knowledge needed. That's the level at which it differs from most of the stuff in RTE. The latest SemEval was somewhere in between. It may well be that for us we want to focus on something that relies more on lexical knowledge than world knowledge. I think there's a distinction to be drawn there, exactly where is murky, but you won't be able to get anyone from outside to agree to that distinction without reasonably sized corpora that rely on lexical knowledge.

Johan: And grammatical knowledge --- marketing for the grammar-based approach.

Dick: There has to be a story about how you would supplement with world knowledge, to show that the linguistic inference is a plausible first step.

Emily: You have to do more: You have to show that the linguistic inference actually helps beyond whatever you're supplementing it with.

Ann: The FraCaS set also distinguishes between plausibility v. monotonicity. And most people are interested in plausibility. The problem with the original RTE was that the distinction between the valid and non-valid inference was just silly. I don't think the RTE I know corresponded particularly well to the sorts of tasks where you want real inference, as opposed to summarization. The question of how you find good potential consequents is a really difficult one.

Dick: Agreed.

Dan: What's the scale of the resource that would be needed to exhibit the distinctions, to be interesting?

Johan: A couple of thousand?

oe: SemEval task 1: 10,000 sentence pairs.

Dan: Always pairs of sentences?

Dick: You want chunks of text.

Johan: Get one text, and use it four times --- with different hypotheses to classify. That's the quickest way to get lots of data.

Dick: That's like the machine reading task.

Dan: Who could construct that data set?

Johan: Us. Look at the plan is to respond --- there was a plan responding .. no. There you have one.

Ann: That's effectively what we're doing when we discuss these examples.

Alex: One thing I don't like about RTE is that it is what humans would take the discourse to be. John talked. So did Mary./John and Mary talked. We'd say yeah, because we assume those two sentences make a coherent discourse. We have to be clear/careful about that.

Johan: Makes a lot of assumptions too.

Alex: But they're not explicit about them.

Johan: Yes they are. E.g.: Proper names in the text & hypothesis refer to the same thing. Or assume parallel PP attachment.

Ann: I think we do rely on what a reasonable human would do, though there is room for spelling some of that out.

Johan: In FraCaS there is some.

Alex: I'd like to keep with that best practice then. The same goes for quantifier scope ambiguity. That original Reyle paper on UDRS annoyed me because he was making assumptions and not being explicit about them. It wasn't purely logical.

Dan: If we constructed that 10,000

Johan: 1,000

Dan: pair testsuite. Let's start with 1,000. If those pairs are the kinds of things you were just mentioning as examples, would those 1,000 pairs convince a non-specialist that we had an engine that was doing something interesting?

Ann: Combine artificial and real-world examples that do the same thing. Pair them up to show what's going on.

Johan: The way I see that is that you have a standard exemplar, click for analyses of the systems, and then real data divided into training & test. It could even be that we hide data, and do a shared task.

oe: I had my coffee next door, I'm tempted to ask you to articulate the high-level goal in this.

Dan: It came from Johan saying it would be nice if we could have an outcome from this three-day gathering other than greater internal wisdom, and the sense quickly shared around the table was that we would be happy if there were more uptake out in the world of what we're doing and less ignorance of what semantics is. We could save everybody a lot of embarrassment (on the part of those who are assuming it's easy) if we could get our work out into that marketplace. We would need to allow ourselves to evaluate what we do and don't do well, figure out what other resources are needed in the pipeline, and find some way to show someone out there what our black boxes can do and entice consumers. We wanted to answer those easily anticipated questions somehow.

Alex: One application for deep linguistic engineering is dialogue systems. We've talked about entailment a lot, but not a lot about dialogue structure and the role of implicature. We desperately need to reach out to dialogue systems folks who are experiencing a plateau with reinforcement learning. We need something quite distinct from an RTE-style task.

Ann: Like:

Alex: Question answering: does this answer the question? Who came to the party?/Look in the visitor's book. Valid response, but not an answer.

Johan: Is this us or AI?

Dan: Does our technology help?

Alex: Yes. Because lexical semantics helps. Predicting agreement would be another one, since agreement is often implicated.

Ann: There's a way of setting this problem up as an inference problem. You can set that up so that the bit that we can do is given X we can do Y. {scribe missed it}

Dick: What's going on in a dialogue system isn't textual inference, but it's inference.

Ann: Can you paraphrase the steps you need to take in a dialogue with text?

Alex: With agreement, I think you could. Alex says Max fell./Tim says John pushed him. Hypothesis: Alex and Tim agree that Max fell.: I would say yes. You can set up an RTE for that for agreement cases. But there's a lot more going on. Is an imperative a command? Go to the store. as a dialogue-initial thing. If it's in answer to a question like How do you make lasagne? Soak the noodles... that's a whole bunch of imperatives, but they're not requests for action.

Ann: I would be interested if there are things that you think are inferences from the linguistic form that cannot be rephrased as some sort of linguistic inference. I believe that a system should be able to explain its reasoning, and that would allow it to show the inferences. And I think that humans can explain their reasoning. I would be very surprised if there were some sort of inference you require that you couldn't do like this, but it may be too long-winded.

Alex: Exactly that. I'd be very worried if one couldn't do that. Reinforcement learning wouldn't be able to do it, but a symbolic system could. But it would look awful --- a text 20 pages long.

Dick: One thing we'd have to look out for, looking at most of the personal assistants out there, it's very task specific, and doing the right thing relies on a lot of domain knowledge about the task.

Alex: You can articulate the background knowledge etc linguistically.

Dick: But that's a lot.

Alex: 20 pages.

Ann: I don't mean you have to do every single logical step, just show some of the consequences.

Alex K: We don't have to show that we can do something everyone else can't, just that with our tools, it's cheaper. E.g. modeling more than one domain is expensive. One thing that comes to my mind is Emily's paper from ACL years ago on porting grammatical knowledge more quickly to another language. It's easy to do phenomena one at a time with shallow methods. The payoff with deep methods is that the adaptation is cheaper. It's not necessary to show that the phenomenon is impossible with shallow methods.

Dick: At Nuance, e.g., there's a lot of acquisition of shallow grammars---relearning a large subset of English for each new domain. It would be far better to start with a good grammar and then specialize it to the domain.

Ann: OTOH if your good grammar is too all-purpose, it may be that specializing is too expensive.

Alex K: That's a question that feeds into Dan's point. What are the things that we should provide to end users? It's a balancing act. If the grammars are too uninformative or too detailed, no uptake. How do you scale that level of detail?

Ann: I've got some personal experience with this now working with David Mott as he tries to apply the ERG. Despite the fact that he's incredibly willing to learn about this stuff, it's an extraordinarily hard thing to do.

Alex K: What's hard about it?

Ann: Everything, in some sense. One of the things that's most complicated is that for someone not very well trained in linguistics, isolating the example from the particular case you're working on now is really hard. You can hardwire more in with limited-domain grammars. And there are some things in the ERS which seem very syntactic.

Alex: And if you get 10,000 readings...

Ann: That's gotten much better over the last few years. The first analysis is generally good enough to get somewhere, though we do simplify some sentences.

Alex K: That would be very interesting to hear about for a few days, for the same reason that Dick said that the thing that failed at PowerSet was the semantics that everybody thought was very nice. There's an extremely limited set of people in the world who have hands-on experience applying general-purpose grammars to some task. The sort of challenges that you run into when you do that would be extremely educational for the rest of us.

Alex: We have experience using a general-purpose grammar to get semantic forms for sentences and then using them as features to statistically learn the dialogue act (in Settlers dialogues: offer, counter-offer), and then using again features from the semantic form and the dialogue acts for discourse parsing. And that's where we get the full-blown discourse structure with the coherence relations. We're not actually using the logical forms of the sentences as they were intended. That's true also for doing tasks like predicting what trade, if any, is going to happen at the end of a negotiation. We symbolically construct a model of the discourse participants' preferences with the dialogue processing as input. There's a mixture of symbolic and statistical going on there, but critically we're not doing anything symbolic with the logical forms.

Dan: Then that's not relevant for today's discussion.

Alex: It is. It's a way that they're being used that needs to be acknowledged.

Emily: I taught a class last year asking people to do something similar with ERSs as sources of features for machine learning. Pulling the semantic dependencies out of ERS is better than bag of words for e.g. sentiment analysis. It's a way to reach out in our marketing, to build the bridge.

Dick: Textual inference is also not purely symbolic, as Johan and I do it. We use shallower features to make some decisions (e.g. skip bi-grams).

Alex: That's something that our data set also needs to cater to.

Emily: It's not clear to me that one and the same dataset can do that and the other.

Alex: We can subcategorize the data set then.

Ann: Let's not kill this by trying to be too ambitious. Having that dataset would be a great idea and one that people would use (one supporting hybrid approaches to reasoning)...

Dan/Ann: But we're not particularly equipped to construct that data set.

Alex: We do have expertise.

Emily/Ann/Dick: No, others do.

Emily: For that particular bit of marketing we might be better off organizing a workshop where we can bring people together and show off this work.

Dick: For dialogues processing, there's ATIS...

Dan: So the use case that you're describing is certainly part of the marketing strategy, showing people how the resources we're building can be used in maybe a surprising way. On the marketing front that makes sense, but it takes us a little afield from the discussion of the dataset we are uniquely positioned to construct, one that would allow us not only to compare our engines, but to demonstrate utility.

Dan: Is there more to say about that?

oe: Hopefully. I don't see it that clearly.

Emily: I really like the idea of operationalizing the discussions we have about is the planner the sleeper into textual entailment questions, but how do we connect that to real-world data? Just searching our annotated corpora?

Dan: How about extending to dialogues, just a little bit?

Dick: I think taking paragraphs out of Wikipedia, and designing the questions so that you have to account for the coherence of the paragraph in Wikipedia.

Alex: Like a reading comprehension test in elementary school. E.g. getting at content that's in the paragraph but linguistically implicit.

oe: I think we've moved from textual entailment to machine reading, like CLEF. That was set up exactly this way --- a paragraph of text, and then a set of questions that the participating systems were to answer.

Johan: A similar, but different task. It's more work to do that.

Dick: If you want to take that kind of dataset on board, we would need to go through and look at which questions rely on paragraph-external world knowledge.

Dan: Here's the part of the task we could hope to provide illumination for.

Alex: I think that would be a fantastic exercise in educating potential users where the gaps are.

Emily: I'd be very surprised if the CLEF data had the kind of semantic-analysis-driven questions we were talking about generating.

Dan/Dick: No, we'd have to add those.

Johan: If you move to questions, the disadvantage is that questions are harder to parse and construct meaning representations for.

Dan: But to have them absent is also a mistake.

Dick: And that task doesn't have to be set up as questions. Could be statements with true/false as a question.

Johan: We should not make it harder for ourselves.

Dan: I'd like some questions in there.

Dick: Personal assistants, for example have a lot of questions and commands.

Emily: Could also ask Who is doing the planning?

oe: CLEF made people select among them.

Dick: Marketing problem: Often the linguistic inferences are going to seem really obvious to people. The way I think about it for question answering is that the main part is actually question asking: what questions are likely to come up on the basis of what you've already got. It would be nice to find some way of wrapping that into this data.

Dan: For some set of those little stories, here are those 10 stories, here's how much you get without any tuning for the domain. To provide a measure of the customization cost.

Emily: But if we put in questions like who is doing the planning? people will say, duh, that's right there in the text.

Alex K: So we need to put in a baseline system that shows how bag of words doesn't find that obvious.

Ann: If you choose the text/domain carefully even those questions might look interesting.

Woodley: Be sure to put a lot of questions in there that look like they might be obvious, but they're false.

Alex K: What I like about this is that it doesn't assume a deep grammar around. Anyone parsing to AMR would appreciate this as a resource for external evaluation without having to worry about background knowledge. It should be a way of exciting parts of the ACL mainstream and finding some common ground.

Dick: Participating in AQUAINT, PARC and a couple of other groups were trying to put in some more data for linguistically-based question answering, and everyone complained, but when it came to doing the evaluation, the only thing that anyone could do well on was that data.

Ann: For the politics of this we need to not just set this up with a group of people who've got deep grammars.

Dan: Would some version need to exist first before we can take the next step? Is that collection of data the result of some other process that we would participate in?

Ann: 20 examples or so within the next months, circulate those in our larger communities for additional ideas/proposals. And then see whether we've got some resources to extend it further.

Johan: I'd like to see this as a wiki-based, open RTE corpus where anyone can add examples and judge the old examples. We don't have to agree on an example; you just have to put your opinion in there. We'd need a wiki-based framework; I'm not sure there's anything that would work off the shelf.

Alex K/Dan: Snapshot or curate subsets for standardized tasks.

Jan: What are the chances that this would be multilingual? My main focus is translation, AMR is also very interested in applying to multiple languages.

Dan: If the size of this object is 1000 pairs and not 10000, then the cost of translation is not enormous.

Jan: There is already a translated RTE corpus.

Dan: That's another aspect in which the wiki-based approach is promising.

Dick: That would be another motivation for taking paragraphs from Wikipedia.

Emily: You have to be careful not to always start from English, though.

Tim: A lot of interest in AMR is multilingual cross-document tasks.

Dan: As people with deep linguistic background, it almost goes without saying, but you're right to point it out.

Dan: Is there anyone who's been saving their remarks?

Johan: Another task: I'm building a corpus of images and true or false statements in English. That would be another way to evaluate meaning representations.

Jan: [...] There are endless possibilities because there are no standard tasks. We did some work on identifying illustrative pictures for newspaper articles.

Dan/Jan: Pictures are great for multilingual tasks.

Johan: I put an example in my background notes. I want to do not just true or false statements, but also a logical model, with a domain of objects, properties they have and relations between them.

Alex: It's a satisfaction relation. The picture's the model.

Johan: And if the vision people want to create a model automatically, even better. But I need a vocabulary of symbols, so it's not really framework independent.

Dan: You can simplify by working not with arbitrary pictures, but with pictures that are systematically generated, as in the robot task. You don't need to insist that those should be real snapshots of bits of the real world.

Johan: They should be, because that's what the vision people are interested in. State of the art is that object detection is pretty good, but not relations between them. That's where they need semantics. The vision people are now going to reinvent the wheel. QA over the images.

Dan: For the multilingual aspect, we'd need to translate those sentences.

Tim: Julia Hockenmaier is doing something similar. They got people to do descriptions like that, but they don't have the truth values.

Dan: This reminds me of a closely related effort: in US elementary school, there's a very strong emphasis on interpreting charts, graphs, etc. (structured objects) for STEM. Is X true, is Y true, what's the value of Z?

Johan: My 4-year-old is learning Dutch prepositions in (English) preschool with pictures and questions like is the book on the table?

Dan: I'm thinking that there is yet another huge population/market with the elementary school task. Need for training examples, automated evaluation, etc. (of student work).

Emily: The end-user community is big, but what about the community in between that is potentially using our technology to do this?

Dan: 140 companies besides the one I work for... The kids have to write out answers on their exams. A language-based strategy is critical.

Ann: You would want someone to set up a standard explanation that you are checking against.


Silvie: We have decided to treat comparatives, and I'd like to take the liberty to start with our framework. I hope you'll like it, because it's one of the few isolated islands in the tectogrammatical description where syntax doesn't matter so much and we really fabricate things you might like as semantic analysis.

Silvie: The dog was older than the cat. Whenever you can hallucinate a comparison predicate, you have to fill it in. #Such is coreferential with old, and cat is the ACT of #Such. Our comparison covers both likeness and unlikeness. We would do the same thing with is like. If there weren't a property like old, we would have to insert something artificial as well, which would have a #coref to something.
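
[Scribe's note: the structure as described, for The dog was older than the cat. --- layout and the comparison functor label are approximate:

    old [PRED, degree: comparative]
     |-- dog [ACT]
     '-- #Such [CPR]              ; coreferential with old
          '-- cat [ACT]]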

oe: What would be an instance?

Silvie: The dog is like a cat.

Emily: And you don't know in what property the dog is like the cat?

Silvie: We'd have to hallucinate the property in two cases.

oe: Before we move away, can I make an attempt at paraphrasing what I see? This part means the dog is old.

Silvie: With old carrying the information that it's comparative.

oe: And this part here means that the cat is old, and there's a comparison of the degree of oldness between these two states of affairs.

Silvie: Yes. This is one of the most sophisticated templates in our annotation. Comparison is one of the most common adverbials, and when you extract adverbials from predicates, you have to think of comparison separately in your search terms (because it looks different from other adverbials).

Jan: On the other hand, the fact that we do the #coref arrow explicitly allows us to distinguish between John hung the picture higher than Paul --- than Paul is, or than Paul did. And there could be even more complicated things. Which from the syntax alone you can't tell.

oe: I believe that is what we wanted to be exemplified by this example: I fear it more than my brother.

Jan: Yes, there will be two analyses, and you have to decide from context which one is the right one.

Silvie: I decided to interpret it as more than my brother fears it. There's no way to underspecify this.

Alex: For the other reading, the arrow would go from my brother up to ...?

Jan: It would go to it, but brother would be the PAT.

oe: And no #Such in this case.

Silvie: Because there is no property in this case.

oe: And the extent of fear isn't that property.

Silvie: There is no way to insert such thing.

oe: I was prepared to paraphrase this as I fear this to a higher degree than my brother fears it.

Jan: There's a semantics behind these hashtags, where #Coref is a thing and #Such is a property.

Silvie: #603: We insert nothing there. #604 would be the same as the first two. And all of these are resolved in this way.

Tim: AMR had a similar set of choices to make. You can treat the dog is older than the cat as the oldness of the dog is greater than the oldness of the cat. We tend not to do the oldness of the cat thing, in favor of just the cat, on the assumption that someone will work it out. I like the more elaborate version in these simple examples, but in longer ones you have to do a lot of hallucinating. The other ugly thing that was pointed out in these paraphrases: this structure implies that the dog and the cat are old. Ideally we'd have veridicality or something over all of this, but right now it says they're old.
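
[Scribe's note: the "just the cat" option Tim mentions, in PENMAN guesswork --- treat the role names as illustrative:

    (o / old
       :domain (d / dog)
       :degree (m / more)
       :compared-to (c / cat))    ; leaving the oldness of the cat
                                  ; for someone to work out]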

Alex: It depends on what happens when you interpret this in a model theory. The degree operator could be a modal.

oe: That was going to be my answer to Johan if he had asked the question I was setting him up to ask.

Jan: We have a degree attribute which says that this is not old, this is older. AMR has this intentionally missing.

Tim: AMR does the dog is old and the oldness of the dog as a similar relation between the concepts, but the dog is old is making much stronger assertions than the oldness of the dog and we're not capturing that difference.

Silvie: What if it's the dog paid 3000 more than the cat?

Dan: The dog is one month older than the cat. Even putting in the measure, you're still saying the dog is old.

Silvie: Back to quantities, if we don't compare properties but quantities, how would you resolve that? Would it be a similar structure? With degree?

Tim: Yes, with a degree operator and a compare-to operator.

oe: And does More dogs than cats appeared illustrate this?

Tim: If it were 10 more dogs than cats appeared, I think it would be more :quant 10, but I'd have to look into that.

Johan: I don't know how to do these things really. What you want to say is that older and oldest and old are at least related to each other. The meanings for the comparative, superlative and positive should be similar to each other, and I think the hardest of the three is the positive case. I don't know what to do for the semantics of old. The best I can do is old means older than most of the people in the domain. But then we get the semantics of old in terms of older. Morphology says that old should be the base case. But as we said before, the cat is older than the dog doesn't mean the cat is old.

Dan: You'll do it like that for any adjective?

Johan: Gradable adjectives have dimensions; here it's age.

Dick: For any gradable adjective, you want to take the grading as primary, and have the grading pick out points. I actually think it's reasonable to have that ordering as primary.
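
[Scribe's note: the grading-as-primary idea in schematic logic --- predicate names invented; Johan's version would fix the threshold from the domain:

    older(x, y)  :=  age(x) > age(y)          ; comparative as the base case
    old(x)       :=  age(x) > standard(C)     ; positive, relative to a comparison class C
    oldest(x, S) :=  for all y in S with y /= x:  age(x) > age(y)]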

Alex: You didn't treat this like you treat your copulas. You didn't do be-older.

Johan: It's just a symbol.

Alex: Oh, it's like your predicative guys.

Ann: And you don't want to do it by just putting a numerical scale on everything and using greater than, like Chris Kennedy does?

Johan: Then you'd need it in your bg knowledge as well.

Dan: Are you trying to capture a difference between old/older and greedy/greedier? Mary is greedier than John --- she's gotta be greedy, doesn't she?

Dick: Everyone is greedier than St. Francis of Assisi.

Dan: So he doesn't have to be greedy, but everyone else is.

Alex: No!

Ann: In the null context, you could not say that everyone is greedy given that context.

Dan: Once you put St. Francis in the mix, everyone is!

Dick: The hermit on the hill is not at all greedy, but he's greedier than St. Francis.

Alex: I think you're mistaking something like a scalar implicature for entailment.

Dick: My baby is one week older than yours.

Emily: Dan is trying to show that greedy and old might be different.

Alex: There's an implicature, but you can cancel it.

Dan: So we're saying that all adjectives have the comparative as the basic semantic form?

Dick/Emily: All gradable adjectives.

Ann: That gives you a beginning of a story about how you modify adjectives (quite old).

Dick: Also bear in mind what happens if you've got adjectives comparing things that are incommensurable. This student is taller than he is clever. The ordering seems to be what's absolutely crucial to understanding comparatives like that. It seems to drive the semantics of adjectives, but not the morphological form. For things like old there's a notion of a comparison class (fossils v. people) and usually that's implicit. But with greedy there's no standard notion of greediness for a human. For things like age, the bg knowledge will tell you that anyone below 20 is young, anyone above 60 is old...
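
[One degree-based sketch of the incommensurable case Dick raises, This student is taller than he is clever, with s the student and d_1, d_2 positions on the two scales; the cross-scale normalization that > would need is exactly the hard part, and is left implicit here:]

    \exists d_1 \exists d_2\, [\mathrm{tall}(s, d_1) \land \mathrm{clever}(s, d_2) \land d_1 > d_2]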

Emily: Thank you.

Jan: We were discussing the comparative construction for quite some time and concluded to do it that way, because language seems extremely mechanical when talking about comparatives. I agree that the base form is the base thing. But whenever you have something that is gradable, people apply the comparative in a very mechanical way. older/younger are simply opposites of the same thing. It suddenly abstracts from everything and only compares numbers on a scale, and doesn't say anything about who is old, young, etc. That's why it looks strange. That's why we did this very formal construction, because we thought it would almost always fit.

Dan: The selfish worry is in trying to figure out how to produce a compositional reading where there is more beautiful, built syntactically (compare lexical older). I want to build a semantics for beautiful that doesn't get in the way of more beautiful. What I really ought to be doing is have the lexical entry make no such commitment, and then have a unary projection that creates the positive adjective in the absence of the degree specifier.

Dick: I did that by decomposing them into two parts. Old modifies this thing, and there is also a degree specification on old.

Dick: This analysis was partly constrained by what we were trying to do in PowerSet, where everything had to be represented as a two-place relation that you could index. What you will see here is that for the cat, old is modifying it, and there's a degree for this oldness. If it were just the cat is old, the amod of cat is old and the degree of old is normal, and that gets removed (default assumption). Three years old: degree is three in terms of years. Here, what's written semantically doesn't make sense because of the semantic index pressure. Degree with amount as argument, and then amount_gt/amount_lt for dog and cat. I wouldn't claim that is the proper semantic representation, just a contortion. Here there is some amount of oldness, which you don't want to interpret as either one being old.
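
[My reconstruction of the two-place encoding Dick describes for The dog is older than the cat; amod, adeg, and amount_gt are his labels, the variable names are mine:]

    amod(dog, old1)      old1 is the dog's oldness
    adeg(old1, a1)       a1 is the amount of that oldness
    amod(cat, old2)
    adeg(old2, a2)
    amount_gt(a1, a2)    the dog's amount exceeds the cat's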

Alex: Would you do later like this? I switched the oven on later than the microwave. If it's part of a discourse so you have some kind of Reichenbachian reference time for those events, then you could do anaphora resolution to get that amount and then you're predicating things about it. You've got a little slot for that.

Dick: That's possible. The other thing I'd want to say is that for that kind of degree, we're saying it's an amount, but we're not saying it's a measurable amount; it could be just a rank in some kind of ordering. John is 5th in the ranking of greediness, Mary is 17th.

Ann: Is that enough though? Imagine a context where there's just Kim and Sandy. Kim has eaten five pizzas, and Sandy has only eaten one; Kim is much greedier than Sandy.

Dick: You may still have a bg comparison class of people, so Kim is much higher in that ordering of people, even though you're only explicitly mentioning two.

Ann: Methuselah is much older than the average person alive today.

Dick: much Xer is pulling in an additional comparison class for what counts as a big difference.

oe: Are these skolem guys the entities?

Dick: Concepts, in my terms. But it's hard to interpret... you would normally want to say there's two different oldnesses, that of the cat and that of the dog, and the dog amount is greater than the cat amount.

Johan: What would the dog is old look like?

Dick: Amod on dog, and in the background there would be adeg of oldness is normal, whatever that means, except for presentation you'd drop it.

Johan: So if 10 years old overrides the normal, then that would be non-compositionality?

Dick: You could probably put together a compositional analysis. It would be fiddly to do, but I think it's doable.

Johan: I think you need something like that, because there's this common interpretation, but it can be overridden if you supply it.

Tim: The student is taller than he is clever---two adeg there?

Dan: Bresnan has some tests to show that there's a missing degree modifier of clever there. I think it's an illusion to think that only the first one is a comparative in that example. Then you're still in a position to get the non-normative comparison.

Emily: Just to be a troublemaker, let's take the shortest person in the class and the teacher says: He is taller than he's clever. Bottom of the ranking in both cases, so?

Dick: Maybe a larger comparison class.

Alex: Really complex speech act.

Ann: Maybe tallness is zero, cleverness is less than zero.

Dan: Or the jump to the next student in intelligence is bigger than the jump to the next in height.

Alex: Children's show purpose. For booking purposes, Abrams is old --- you want that normal to be replaced.

Dan: I'd have to say too old.

Emily: Isn't it enough that Dick's representation says degree is normal, but the discourse processor can pick out what the normal is in context? It's ... non-specified

Dick: Yes. Or underspecified. Take your pick.

Silvie: What about if we don't have the other referent, like The dog is much older?

Johan: Let's see the ERS.

oe: I see parallels here to both the Prague structure and Dick's---it can be interpreted in Dick's terms. We do say old of dog and degree of oldness, in comparison to the cat.

Emily: But we don't say in comparison to what about the cat.

oe: AMR wasn't copying that property either.

Johan: You say the dog is old, right?

Dan: Unfortunately you say the dog is old.

oe: To some degree

Dan: I'm with Johan here. If you have a degree, then it's not actually old.

Emily: But comp_rel is a degree.

Dan/Dick: What I'm missing is a degree normal.

Johan: How would you do that in the grammar?

Dan: old comes in with a specifier slot, but if I discharge that specifier without putting anything in it, that's where I put in the positive.

Ann: I wonder if there's another way of doing this by cheating: observe that all we've said about old, though we didn't bother to put it in, is that old means old with respect to p, p filled in by context. I wonder if one could in a rather horrendous way say that when the context is the comparative, that's doing something to that p, which is preventing you from saying anything about the absoluteness of that oldness.

Dan: So you're pushing it to the interpretation of that p, which we have to do anyway because this elephant is small.
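
[A degree-theoretic sketch of Ann's move; the threshold function theta over the contextual parameter p is my notation, not hers:]

    \mathrm{old}_p(x) \iff \mathrm{age}(x) \ge \theta(p)

[The comparative then acts on p rather than asserting anything about the threshold, so no absolute oldness follows; and this elephant is small just resolves p to elephants rather than to things in general.]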

Woodley: There's no such thing as small, globally.

Emily: The Higgs boson?

Woodley: We don't know what's smaller.

Silvie: The dog is much older and we don't know with respect to what. There's no comparative there. It looks like the dog is very old.

Emily: There's nothing inside that node for old that says that it's comparative?

Silvie: Only the morphological information.

Dan: You could have distinguished two kinds of degree specifiers, much v. very, because much doesn't go with positive adjectives, so that would let you tell.

Silvie: How?

Dan: It assigns a different label somehow? What I'm objecting to is that you wouldn't see in English *The dog is much old.

Silvie: We have the information about the grammatical degree of the word old, and that's enough to reconstruct the sentence correctly.

Jan: In every PDT representation, there are 20 attributes that we aren't showing you, as opposed to AMR where what you see is what you get. Three types of modality, etc. They're an inherent part of the representation, without which you'd be losing the difference between the sentences. The much, except for the syntax, is orthogonal to the fact that there is comparison. But it could also be older by two months. It's just a convention that we don't have the CPR construction, because it would be full of artificial nodes. The guidelines say if you have nothing in the sentence, you don't hallucinate it at all.

Emily: But isn't the -er the evidence that you need to hallucinate all that?

Jan: In theory we should have included the whole construction, with info from context. But the guidelines convention is, if the sentence doesn't have it, we don't put anything in. Leave it to pragmatics. But if there is one item then we put it in and carefully add the rest.

Emily: My point is that older is that one item.

Jan: For me an item is a separate node in the tree.

oe: Are you now in the business of encouraging other people to hallucinate more?

Emily: I'm just being a troublemaker.

[oe makes troublemaker name plate]

Silvie: Can we see the ERS?

oe/Emily: The much is the degree specifier of the comparison, but the comp_rel is the degree specifier of the old.

Silvie: You don't need two arguments of comp_rel?

Dan: It's unexpressed.

Silvie: But it's valid like that?

Dan: Yes. Just like a transitive verb with an optional argument or a passive without the by phrase will still have the semantic positions for the dropped arguments. I don't want to change the arity of the predication just because there isn't an overt supplier for that piece of info.

Dick: This one's a bit screwed up in XFR. But not as much as The dog is older than the cat is. Clearly the parser doesn't know what to do with the elliptical ones, and took is as an abbreviation for Isaiah. In comparison to that, the one below doesn't look too bad. The amod degree is greater for the dog than the null pronoun. So you get the ordering of the difference, and then the amount diff is the difference in the amount. It's weird in trying to use the much to refer to both the amount and the quantity of the difference.

Alex: Where's the null pronoun?

Dick: It isn't there; it should be there. It was an attempt to try not to introduce too many terms when you're trying to index. That kind of constraint really messes things up in the semantics.

Emily: Sounds similar to the ERS analysis.

Dick: Yes.

Johan: In DRS... [asks oe to fill in the missing pronoun in the boxes.] And otherwise the rest is the same as before. Of course I'm not happy with the much, but I don't yet know how to do degree modifiers.

Emily: Because you don't have an explicit comparison...

Tim: In the AMR we don't inject the comparison set.

Emily: And that's because it's not there, so you can't put it in.

Silvie: So more is an artificial concept here.

Dick/Dan: It's coming from the -er; normalizing over older/more beautiful.

Dick: Also: you can have much less old.

Silvie: Moving to the next sentence, let's show mine again.

Tim: What would you do with the dog isn't an adult, much less old?

Emily: That's a different much less.

Tim: Like let alone.

Dick: Yeah, probably a MWE.

oe: You make that a tall tree...

Emily: But is it taller than it is clever?

Jan: I think the CPR label should have been with a copy of appear. And cats should have been the argument of that.

Silvie: I should have copied the entire structure as well.

Jan: We have less freedom than AMR, but there's still some freedom. In certain nodes there are two options: (i) insert things that are not overtly displayed on the surface, (ii) insert nodes that we know exactly what they are, and that's what we call a copy, and copy most of the info from some other node. Very often in coordinations, where the syntax is broken. Comparisons are another frequent example. The Democratic party got 17.7% of the vote, the Republicans 25% of the vote.

Silvie: I just mixed it up with quantity like more than 5 people appeared, but this should be like the cases before.

Emily: Still being a troublemaker: Would this be true if no cat appeared?

Jan: True? That's not usually something we care about.

Alex/oe: We can do the same thing as before with the interpretation of the comparative/degree specifier.

Tim: AMR would have two appears kind of like yours. "Dogs more compared to cats that appeared." We don't make the quantity explicit for the cats, which could be problematic.

Alex: Why? When it's the comparative set, there's nothing stopping it being zero.

Woodley: It could be compared to (q / quant :quant-of (c / cat)), which would be ugly and which we don't do.

Alex: And you don't have to, because it all depends on what the model theory does with the compare relation.

oe: But this is a distinct appearing situation?

Alex: He's allowed to copy, not hallucinate.

Dan: Are you troubled by the weirdness of *More dogs than five cats appeared? Again, the Bresnan argument is that there's a necessarily missing quant specifier for cats. So you should put in the quant value.

Tim: We would do it the richer way if we got More dogs than the small number of cats

Emily: Not a problem for the Bresnan analysis because it's the number you're comparing to.

oe: More dogs than those few cats appeared?

Emily: Star.

Woodley: More cats than the small number of Siamese my neighbor keeps appeared.

Emily: That one is pretty different, because the Siamese are included in the group of cats.

Woodley: Syntactically very similar, though.

Johan: Look at that.

Emily: It's one long skinny box.

Johan: This is the collective interpretation. There's a set of things appearing, the set including both cats and dogs appeared.

Woodley: What do you do when it's more rice than sugar is in this dish?

Dick: The cardinality can count the number or the amount.

Woodley: So you could have a set theory where that could be a non-integer?

Ann/Dick: Yeah.

Woodley: That's new to me.

Dick: This is nice because it doesn't commit you to any cats appearing.

Johan: It's also not what Boxer produces.

Emily: This looks like a nice analysis of my interpretation of Woodley's Siamese example. Is there a reading for the other one?

Johan: The distributive one? Like when the cats appeared yesterday and the dogs today. Then there would be two events.

Emily: In that case is there an entailment of a non-zero number of cats?

Dan: More dogs than unicorns appear on the streets of Berlin.

Dick: Do you really want to be able to refer to the numbers?

Johan: Yeah, not all things in the top box are discourse referents in that sense.

Dan: More dogs than cats appeared. It was huge.

Emily: It can't refer to the number of dogs, or cats, or the size of the difference.

Alex: More dogs than cats appeared. It was a huge difference. is okay.

Emily: Yeah, hmm...

Silvie: And ERS?

Dan: So you'll see this ugly much-many predicate (to work with both rice and cats), and the comparison is still the degree on that manyness, and again sticking the cats right into the comparison, not even telling you that it's the number of the cats, but letting you do that interpretation.
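
[A hand-simplified sketch of the predications Dan is describing, with quantifiers and handles left out; predicate spellings are approximate, not verbatim ERG output:]

    dog(x), cat(y), appear(e, x)
    much-many(e2, x)     the manyness of the dogs
    comp(e1, e2, y)      that manyness, with the cats stuck straight in as the standard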

Emily: Are you not doing what you said Tim should do, by not saying anything about the number of cats?

Dan: I'm saying even less than Tim did --- only one appearing situation...

Alex: This will undergenerate. If you've got 100 cats in your model and 0 cats but 3 dogs appeared, this would come out false.

Dan: But we've said that this comparison guy is an operator that can perform magic.

Emily: At that point have we lost some compositionality if we're asking that much of the post-processing step?

Alex: That's why it's nice to have a big post-processing step, because it allows the grammar to be compositional. It allows the syntactic constructions to behave themselves. I feel the same way about the presuppositions. Don't worry about doing that in the grammar.

Dan: Then it's an apples and oranges comparison, because we're only doing the first third of the work compared to Dick.

Alex: Dick has lexical resources etc.

Dick: It's not just the lexical resources; those could get folded into a compositional approach. It's that there's some inference going on over the lexical resources.

Ann: Do you think it's right to say you can never copy something in a strictly compositional approach?

Dick: No.

Dan: I even did that for a year in gapping constructions.

Dick: I'm pretty sure that if you're trying to get the distributed reading over subject NP coordination compositionally, you could do something to copy the VP.

Johan: I do a lot of copying. You can't just use unification, you need to rename variables. To go back to the example, what is missing here?

Alex: The appear for the cats.

Johan: Why not treat it as a generalized quantifier?

Ann/Dick: I like that way of doing it.

Dan: Say again?

Johan: Introduce one variable for both of them, for the one set.

Dan: Like for coordination. Interesting.

Emily: But that's the collective reading.

Dick: Cumulative, not collective.

Dan: We claim that we can provide either collective or distributive reading for coordination---that's what Rui Chaves did, but there would be enough info in there to make the decision down the line.
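
[For reference, the textbook generalized-quantifier truth conditions for more ... than as a determiner over three sets; this is one way to cash out Johan's suggestion, though his one-set version is the cumulative variant. Note that nothing here requires any cats to have appeared:]

    \mathrm{more}(\mathit{dog}, \mathit{cat}, \mathit{appear}) \iff |\mathit{dog} \cap \mathit{appear}| > |\mathit{cat} \cap \mathit{appear}|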

Dan: This was worth the trip.

Silvie: Still time for one more sentence. I would vote for A more aggressive dog than mine appeared. Here we have a simple structure, so we don't want to insert be. We underinterpret it here.

Alex: You're not even saying my dog.

oe: It could be to my cat.

Dan/Emily: No, it can't.

Silvie: It's just a shortcoming of our representation---the English part, since this wouldn't have emerged in the Czech part.

Tim: In AMR, we don't copy the aggressive, but hope you can get it in post-processing, and just copy dog possessed by "I". And we don't copy appear because my dog didn't appear.

Johan: Didn't do that one...

oe: In the ERS, the degree of aggressiveness is in relation to some entity.

Dan: Which is not related to dog anymore.

Emily: So we have the same shortcoming as we saw in the Prague representation.

Alex: It would be a pretty non-compositional rule that would have to dig around for that dog.

Dan: So the consensus is that that's a pretty grammaticized link.

Alex: If you treat this as a generalized quantifier ... that wouldn't help.

Dick: There really is ellipsis going on.

Dan: This is one case where it would be awfully nice to be able to copy.

oe: Could do that in principle.

Ann: But the predicate name... A more aggressive three hundred pound African gorilla than mine appeared.

Dan: I'm going to be really impressed if you tell me that is 17 ways ambiguous.

Dick: If you're treating it as ellipsis, there's always the question of how much to copy.

Alex: Maybe it is pragmatic then --- what's the scope of the antecedent? But what's grammar driven is where in the structure you can pick it up from.

Dan: A more aggressive three hundred lb gorilla than John's of mine...

Ann/Emily: Star.

Dick: We need to get a handle on how ellipsis resolution is handled. You don't want a separate solution.

Ann: I was just objecting to the just copy the predicate idea.

Dan: Yeah, I agree it's part of that larger problem.

Emily: It's a three hundred pound aggressive unsolved problem.

Dick: In XFR: The dog appeared, the null pronoun doesn't, but it doesn't restrict the null pronoun to being a dog, so it's like ERS.

Dan: That's reassuring, because if it did solve that problem, you'd have a general solution for ellipsis.

oe: Thank you Silvie. A good choice of final phenomenon. It has a rewarding degree of richness, depth.


oe: Let's give everyone the chance of making a closing statement about what you're taking home, what you would like to do next, what you would like to see happen next.

Woodley: I'd like to see some of the veridicality stuff show up in the ERG outputs.

Dick: Lauri has made a lot of that factive etc. information available. I'll try to make some of my other resources available more broadly. Other similar approaches being able to take advantage of this is clearly a gain for everyone.

Woodley: Is there a small portion available now?

Dick: The basic info for doing veridicality for a number of verbs, yes. The other things would be like argument structure for nominalizations.

Emily: In the very near term, I plan to try to systematize what I've learned about compositionality here for a set of slides for Tuesday.

Dick: I would love to see something like this happen again with a similar level of attention to detail. There's lots more that could be covered. We've kind of begun to understand the other approaches, and could build on that. I'd like to also see us begin to gather together data for developing and testing what we're doing.

Woodley: Future meaning one just like this one, one looking at implications, or one that unifies those?

Dick: One that unifies that---look both at representations and implications between them.

Alex: I'd like to understand better what we discussed this morning, the gap between what comes out of the grammar and users like me jumping off that cliff and going 'wheee!' into complete AI. It's still not clear to me how big that gap is. I'd like to see at a future meeting real focus on that. Getting this data would really help with that. Would be really useful to me as the token end-user.

oe: Tim, we invited you to something that was potentially a little brave: to present something new to a crowd strongly rooted in the logical/linguistic tradition.

Tim: I loved being here and appreciated that, because AMR has this reputation of being spun out of nothingness, where it really does have deep connections to these grammatically principled things. Having awareness of those connections helps a lot. The ideal AMR---it seems like you do all this grammar stuff, and then make a bunch of additional jumps, and then look at AMR. But there's still things like scope where AMR seems to lose a lot of info. The projected goal is figuring out which of these we can at least approximate.

Alex: I have a hidden agenda: I'm getting really pissed off with lots of dialogue models that assume that the hypothesis space is complete. Random variables that represent all the states you'll ever be in, a complete set of actions, a set of preferences, ... completely pre-defined and static. But actually conversation is not like that at all. There's information exchange in the dialogue where you're learning what the possibilities are, so you have to adapt your decisions as you're conversing. Not reinforcement learning; this is the hypothesis space changing when you play that conversation once and only once. What you need is a model of decision making that can adapt to a changing vocabulary. An underlying qualitative model with a symbolic vocabulary, a logic, a model theory. When someone says something out of left field, you've got to reason on the fly about what he's doing, for changing the qualitative model of the game. Where the info coming in is inconsistent with the hypothesis space, you've got to down-date the model a la AGM. That's where logic really has bite.

oe: One source of inspiration was the STEP symposium you organized, Johan.

Johan: I really enjoyed this, but personally, I'd like to see more of the inference kind of stuff. It doesn't mean you have to be able to do it, but explain how you would do it. I think it would be great to have a way for the community to make a testsuite that has added value. Need some server somewhere, I guess...

Alex: I'd like to see a pilot experiment where we build an agent who has a deficient model of what conversations might happen who can learn from it by doing that inference.

Johan: That's ambitious. That's one reason I work on text, not dialogue.

Jan: I have a suggestion --- why don't we do, next time, a Dagstuhl seminar, where we could invite more people. First you meet together, then you have several parallel sessions, then on Days 4 and 5 you come back together. There could be a dialogue session, an inference session. Not much bigger than this, but something like that. That would fit the idea of the Dagstuhl seminars.

oe: And not require the support of any national government...

oe: A Dagstuhl seminar on applications of computational semantics? Candidate title.

Johan: Not sure these are applications...

Ann: I like very much the idea of doing the inference stuff, I like the idea of starting to build up some set of examples and so on. As a Dagstuhl seminar, I'd very definitely say we'd have to have productive outcomes during the week. And there's one other thing that we haven't mentioned ... those slightly sarcastic comments about what we tell our PhD students to throw away. There's a set of applications where you look at the semantic representations as giving you structural information. There's probably a way of learning things from the other representations which is what sort of structural information do you want when you're doing those sort of tasks. Also potentially the feature-extraction type things. That's not at all the same as let's go deeper and work out how to do inference. Using it as a way to find relatively language-independent, multilingual structure.

Dan: It would be really nice to have a few different languages represented.

Ann: I think the next meeting should have some other languages represented.

Silvie: Very little to say. I'm just an annotation person, very deeply immersed in one particular framework, but I'd gladly take on any annotation work you have for me. [Negotiated through Jan.]

Dan: I think that construction of a new dataset and the challenge of trying to work towards the inference, the more outward-relevant level of representation, would be illuminating. I think if we're gonna have such a session, there would be a longer lead-time and homework cycle involved, because in order to even try to do it I'd need to work through things and find out what's hard in my pipeline and what the shape of the missing resources is. That would be a steep education curve. I have for a very long time thought that cleverer people would take the grammar's output and then take off with it. It would be fun to try and actually walk down that road a ways.

Woodley: Down the inference road.

Dan: Yes. Because that's a lot of what I think I'm trying to support in what I'm trying to do.

Woodley: It's a nice hard problem.

Dan: And if you, Woodley, were to do something about pronoun resolution along the way, that would be good. I think aiming for another event with not too many more people/egos in the room would be good. With a bigger room, we could fit a few more.

oe: I don't have a deep enough training in the application of mathematical logic to NL semantics, but I feel like some of the excitement in efforts like AMR comes at the risk of severing some ties that I'm not prepared to see severed completely. I'll try to work out for myself that comparison matrix. More concretely, I have an incoming PhD student and I'm very inspired by observing Dick and Johan specifically going further than we have traditionally done. I'm inclined to try to use that PhD project to look into disambiguating e.g. the relations in NN compounds.

Ann: Could just slot in the existing work. Should fit with ERS. But I might be wrong.

oe: I was thinking of the compound bracketing, compound roles, interpretation of the possessive. I'm inspired to look into those NP-internal, underspecified roles that we produce, that other parsers might leave even more underspecified, and try to bring together and unify some of the existing proposals and see how that actually scales. Whether and how.

Woodley: As an operation that takes ERS and refines it.

oe: Yes, starting to build something after parsing. Possibly also after parsing with other engines.

Dick: I'm a consumer of two parsers.
