Zero Pronouns in Tamil and Malayalam
The closely related and typologically similar languages Malayalam and Tamil both allow
subjects and objects to be dropped. Both are discourse pro-drop rather than agreement
pro-drop languages: dropped arguments are licensed not by overt agreement marking but by
discourse factors.
The two languages differ, however, in that Tamil has much more agreement marking on
the verb than Malayalam. Verb suffixes in Tamil encode person, number, and gender,
except on negated verbs, modal verbs, and verbs with dative-experiencer subjects. Malayalam
verbs, by contrast, lack overt person, number, and gender agreement.
Although agreement marking on the verb is not required to license zero pronouns in either
language, the present study asks whether the recoverability information that agreement
markers provide about the dropped entity has any reflex at all.
The questions of interest are:
(1) Are zero pronouns more common in Tamil, in which agreement marking provides
recoverability information about the dropped entity, than in Malayalam?
(2) Do zero pronouns occur in cases in which recovering the antecedent from
discourse constraints alone is difficult? If so, is this more often the case in Tamil than in
Malayalam?
Centering Theory[1] provides an algorithm for addressing these questions. Centering Theory
explicitly defines the notion of "topic", or what an utterance is about. It also allows prediction of
what the topic of an immediately following utterance will be, based on a ranking of the salience
of the entities in the current utterance. The idea is that speakers modulate the attentional state of
their listeners by exploiting such linguistic devices as grammatical position (e.g. subjects are
more salient than objects). I expect that arguments will be dropped only if the entities they
refer to are currently highly salient for the hearer.
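
The ranking idea can be made concrete with a small sketch. It is illustrative only: the role ranking (subject above object above everything else), the Mention type, and the function names are assumptions introduced here, not the ranking or tooling used in the study. It computes the salience-ordered entities of an utterance, the predicted topic of the following utterance (the most salient entity), and the topic of an utterance given the one before it.

```python
from dataclasses import dataclass

# Assumed salience ranking by grammatical role (lower number = more salient).
# The ranking appropriate for Tamil or Malayalam may differ.
ROLE_RANK = {"subject": 0, "object": 1, "other": 2}


@dataclass(frozen=True)
class Mention:
    entity: str  # discourse entity referred to
    role: str    # grammatical role of the mention


def forward_centers(utterance):
    """Entities of an utterance, ordered from most to least salient."""
    ranked = sorted(utterance, key=lambda m: ROLE_RANK.get(m.role, len(ROLE_RANK)))
    ordered = []
    for mention in ranked:
        if mention.entity not in ordered:  # keep the most salient mention only
            ordered.append(mention.entity)
    return ordered


def preferred_center(utterance):
    """Most salient entity: the predicted topic of the next utterance."""
    centers = forward_centers(utterance)
    return centers[0] if centers else None


def backward_center(previous, current):
    """Topic of `current`: the most salient entity of `previous`
    that is also realized in `current` (None if there is no overlap)."""
    realized = {m.entity for m in current}
    for entity in forward_centers(previous):
        if entity in realized:
            return entity
    return None


# Invented two-utterance example (not corpus data).
u1 = [Mention("Ravi", "subject"), Mention("book", "object")]
u2 = [Mention("Sita", "other"), Mention("Ravi", "subject")]
```

With these invented utterances, preferred_center(u1) is "Ravi", and backward_center(u1, u2) is also "Ravi", since Ravi is the most salient entity of u1 that recurs in u2.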
Based on a study of zero pronouns in Hindi[2], I predict that discourse constraints will require (at
least) that a dropped argument be both the topic of the previous utterance and projected to be
the topic of the current utterance. I conducted a pilot study based on a Centering analysis of a
Tamil corpus (from the Central Institute of Indian Languages); this prediction was met in 42 of
the 45 utterances with zero pronouns. A similar analysis will be conducted with Malayalam
data in order to shed light on questions (1) and (2) above, and to contribute to the understanding
of the role of discourse factors in licensing grammatical choices in Dravidian.
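
As a rough illustration of how the prediction could be checked against an annotated corpus, the sketch below assumes each utterance has already been annotated with its topic, its predicted next topic, and the entities realized as zero pronouns. The Utterance record, its field names, and the toy discourse are hypothetical and are not the pilot data.

```python
from typing import NamedTuple, Optional


class Utterance(NamedTuple):
    cb: Optional[str]  # topic of this utterance
    cp: Optional[str]  # entity projected to be the topic of the next utterance
    zeros: tuple = ()  # entities realized as zero pronouns in this utterance


def meets_prediction(prev: Utterance, dropped: str) -> bool:
    """True if the dropped entity was the topic of the previous utterance
    and was also projected by it to be the topic of the current one."""
    return dropped == prev.cb and dropped == prev.cp


def tally(discourse):
    """Return (zero pronouns meeting the prediction, all zero pronouns)."""
    met = total = 0
    for prev, cur in zip(discourse, discourse[1:]):
        for dropped in cur.zeros:
            total += 1
            met += meets_prediction(prev, dropped)
    return met, total


# Invented annotations for illustration only.
toy = [
    Utterance(cb=None, cp="Ravi"),
    Utterance(cb="Ravi", cp="Ravi", zeros=("Ravi",)),
    Utterance(cb="Ravi", cp="Ravi", zeros=("Ravi",)),
    Utterance(cb="Ravi", cp="Sita"),
    Utterance(cb="Sita", cp="Sita", zeros=("Ravi",)),
]
```

Counting per zero pronoun rather than per utterance is a design choice of this sketch; the two coincide when each utterance contains at most one zero pronoun.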
References
[1] Grosz, Barbara J., Aravind K. Joshi, and Scott Weinstein. (1995). Centering: A framework
for modeling the local coherence of discourse. Computational Linguistics, 21(2): 203–225.
[2] Prasad, Rashmi. (2003). Constraints on the Generation of Referring Expressions, with
Special Reference to Hindi. Ph.D. Dissertation, University of Pennsylvania.