SlavicMatrix

AntskeFokkens edited this page Jun 28, 2011 · 1 revision

This is a page about the Slavic Matrix developed in Saarbruecken and Sofia.

Here are the notes from the subgroup session held at the Suquamish DELPH-IN summit:

Varya's grammar: phenomena to be added soon.

Yi: What has been done on your side? What are your ideas...

Varya: Question on embedded clauses, and main interrogatives. It's small (70-item lexicon); we also cover negation. Those were the three things you're interested in. Embedded clauses: complements of "think" or "ask".

Yi: Case?

Varya: Genitive negation is not there yet.

Yi: Tania would like a more thorough investigation of the phenomenon.

Tania: Tense and aspect in Slavic languages: commonalities, but there are also differences.

Yi: Of course the aspect: general Slavic is important, but there must be decent coverage of individual languages, so that we can do treebanking.

Emily: Can we do a workflow where this is marked? Pan-Slavic considered or not?

Yi: Namespace concept on types? Needn't be an adaptation of the formalism: can be a convention of prefixes in names... From our side: this is only part of the hierarchy, it is hard to keep track.

Dan: The namespace idea is appealing to me too. Virtue of file organization: relatively easy to manage, easy to maintain. Problem: how do you decide when to move, AND it is no longer present as soon as it is loaded in LKB. Suffix better than prefix.

Emily: ISO code also has family codes.

Dan: You'd need a hierarchy for cross-classification?

Emily: Bigger question: code sharing between languages smaller than the entire family: sub-families?

Antske: Should be possible with the metagrammar engineering: automatic suffix adding should be easy enough; could capture sub-family properties.

Emily: Realize that it can be done as well as grammar is already advanced: starting with big chunks, then back out parts.

Yi: Compatible with Tania's way of working: try out, then decide...
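Note on the suffix convention: the "automatic suffix adding" idea raised by Dan and Antske could look roughly like the sketch below. This is only an illustration, not the metagrammar engineering implementation. The function names `add_family_suffix` and `rename_types`, the example type `aspect-verb-lex`, and the choice of the ISO 639-5 family code `sla` with a hyphen separator are all assumptions made for the example.

```python
import re


def add_family_suffix(type_name: str, family_code: str, sep: str = "-") -> str:
    """Append a family-code suffix to a type name (hypothetical convention)."""
    suffix = f"{sep}{family_code}"
    # Leave names alone if they already carry the suffix.
    if type_name.endswith(suffix):
        return type_name
    return type_name + suffix


def rename_types(tdl_text: str, type_names: set, family_code: str, sep: str = "-") -> str:
    """Rename whole-identifier occurrences of the given type names in TDL-like text."""
    if not type_names:
        return tdl_text
    # Longest names first so a longer name wins over a shorter one it contains.
    alternatives = "|".join(
        re.escape(name) for name in sorted(type_names, key=len, reverse=True)
    )
    # The lookarounds keep us from matching inside a longer identifier
    # (identifier characters assumed here: letters, digits, _, -, +, *).
    pattern = re.compile(r"(?<![\w+*-])(" + alternatives + r")(?![\w+*-])")
    return pattern.sub(
        lambda m: add_family_suffix(m.group(1), family_code, sep), tdl_text
    )


if __name__ == "__main__":
    sample = "aspect-verb-lex := verb-lex & [ SYNSEM.LOCAL.CAT.HEAD verb ]."
    print(rename_types(sample, {"aspect-verb-lex"}, "sla"))
```

Because a name that already ends in the suffix is never matched again, running the renaming twice leaves the text unchanged. Sub-family sharing would only need a different code passed in.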

.....

Dan: Possible problem: what will Petya think? Is it worthwhile: additional research question and brittle method being involved... methodology training...

Emily: Can be done only at SB-side, then just porting slavic.tdl: would mean less feedback from Bulgarian.

Emily: Separate set of questions: joint test suites. Test suite is rather small...

Yi: Do we have a larger test suite?

Tania: We do have the idea to get a joint test suite: we need to get morpho-syntactic test suites, we have one for Polish, Petya works on Bulgarian, and we need this too. We'd like it parallel between Slavic languages; we don't have it at the moment. We can talk to Varya...

Yi: How about getting treebanks? The Little Prince.

Emily: There is some appeal, but Francis schooled us on license issues.

Varya: 13 chapters annotated for information structure and tense and aspect.

Tania: But not parsed?

Emily: I told them no phrase-structure annotations, because the grammars will do that.

Yi: Collaboration with Moscow people who gave us a very large dependency treebank.

Joshua: There is a three-letter code for Slavic languages.

Yi: Underscore or hyphen...

Dan: I suggest using hyphens everywhere, or at least do underscores everywhere... Has the meeting taken place?

Tania: It will be in summer.

Emily: Can we make additions?

Tania: Yes.

Emily: Should we use a shared repository?

Yi: We'd like to keep hosting it in SB; it has already been shared with Sofia.

Tania: Is Little Prince a focus? Are there special phenomena of interest?

Emily: Useful for information structure annotations, which we are interested in. What phenomena?

Varya + Songhoun: First-person narrator, lot of vocatives.

Tania: Additional task: get vocabulary for Little Prince.

Yi: Work on automatic acquisition.

Tania: But not necessarily that of the Little Prince...

Emily: We would need to get the vocab together, but not just focus on this text.

Tania: Preferably not a translation.

Dan: Can you look for a short story, not too long, easy narrative, ideally one that has correspondence in other Slavic languages?

Tania: Parallel corpora would be great.

Dan: Parallel corpora with Russian original; resource must be in Russian.

Tania: Possibly Communist Manifesto.

Agree: seems to be a good idea, lot of languages, open source by now... government supervision?

Emily: Short text, not parallel, there is always Wikipedia.

Dan: It is a particular genre that misses certain phenomena.

Varya: Sherlock Holmes? Francis mentioned, translated in many languages...

Emily: All languages do also need some authentic text.

Varya: Tania, you were asking about the things observed in Little Prince? Pro-drop.

Tania: There are a lot of fragments...

Emily: They are independent.

Varya: Word order in terms of information structure could be interesting, but it is mostly neutral.