Syntax in Computational Linguistics
1) Introduction to Syntax
Syntax is the study of how words join together to make sentences.
It tells us:
· Who is doing the action (subject)
· What the action is (verb)
· Who or what receives the action (object)
Example: John visited Mary.
· John = subject (doer)
· visited = verb (action)
· Mary = object (receiver)
In computational linguistics, this can be written like a small formula:
Visit(John, Mary)
1. Why “Mary visited John” means something different
In coding terms:
Visit(John, Mary) → John visited Mary
Visit(Mary, John) → Mary visited John
Even though the words are the same, changing the order changes who is doing the action.
Humans understand this naturally.
Computers need rules (syntax rules) to figure this out.
So, syntax gives the structure of a sentence, helping computers understand language.
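The Visit(John, Mary) notation can be sketched in Python. This is only a minimal illustration, assuming plain three-word subject-verb-object sentences; the names `Predication` and `parse_svo` are hypothetical and do not come from any library.

```python
from typing import NamedTuple


class Predication(NamedTuple):
    """A predicate with its two arguments, e.g. visited(John, Mary)."""
    predicate: str
    subject: str
    obj: str


def parse_svo(sentence: str) -> Predication:
    """Read the roles off word order. Assumes exactly: subject verb object."""
    words = sentence.rstrip(".").split()
    if len(words) != 3:
        raise ValueError("this sketch only handles three-word SVO sentences")
    subject, verb, obj = words
    return Predication(predicate=verb, subject=subject, obj=obj)


a = parse_svo("John visited Mary.")
b = parse_svo("Mary visited John.")
print(a)       # Predication(predicate='visited', subject='John', obj='Mary')
print(a == b)  # False: same words, different roles
```

The two results contain the same three words but compare as unequal, which is exactly the difference between Visit(John, Mary) and Visit(Mary, John).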
2) Basic Syntactic Concepts
a) Subject – Predicate Relation
Every sentence has:
· a predicate = usually the verb
· arguments = the people or things involved (subject, object)
Different languages show these roles differently:
· English uses word order
o John (subject) → visited (verb) → Mary (object)
· Japanese uses case markers
o John-ga (subject marker)
o Mary-o (object marker)
Computers must understand these patterns.
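The two strategies can be contrasted in a short Python sketch. It is a toy under stated assumptions: the function names and the `CASE_MARKERS` table are hypothetical, only the two markers mentioned above are covered, and the English word "visited" stands in for the verb because the example above gives no Japanese verb.

```python
def roles_by_position(words):
    """English-style: assume the order subject, verb, object."""
    subject, verb, obj = words
    return {"subject": subject, "verb": verb, "object": obj}


# Hypothetical marker table for this sketch; real Japanese has more particles.
CASE_MARKERS = {"-ga": "subject", "-o": "object"}


def roles_by_marker(words):
    """Japanese-style: read the role off the suffix, whatever the order."""
    roles = {}
    for word in words:
        for marker, role in CASE_MARKERS.items():
            if word.endswith(marker):
                roles[role] = word[: -len(marker)]
                break
        else:
            roles["verb"] = word  # an unmarked word is taken to be the verb
    return roles


print(roles_by_position(["John", "visited", "Mary"]))
# Reordering the marked words does not change the roles:
print(roles_by_marker(["John-ga", "Mary-o", "visited"])
      == roles_by_marker(["Mary-o", "John-ga", "visited"]))  # True
```

With word order, swapping two words changes the analysis; with case markers, the roles travel with the words.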
b) Phrase Structure
Words group together into phrases that act like one unit.
Example: the tall boy from the park
This whole group = one noun phrase (NP).
Syntax studies:
· how phrases are built
· how they can be expanded
· how one phrase can contain another
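One way to picture a phrase containing another phrase is a nested data structure. The sketch below stores "the tall boy from the park" as nested tuples; the bracketing is one plausible analysis chosen for illustration, PP (prepositional phrase) is a label added here, and the helper names `words` and `phrases` are hypothetical.

```python
# A phrase is (label, children...); a word is a plain string.
TREE = (
    "NP",
    ("NP", "the", "tall", "boy"),
    ("PP", "from", ("NP", "the", "park")),
)


def words(node):
    """Flatten a phrase back into its word sequence."""
    if isinstance(node, str):
        return [node]
    _label, *children = node
    return [w for child in children for w in words(child)]


def phrases(node, label):
    """Collect every sub-phrase with the given label, outermost first."""
    if isinstance(node, str):
        return []
    found = [" ".join(words(node))] if node[0] == label else []
    for child in node[1:]:
        found.extend(phrases(child, label))
    return found


print(phrases(TREE, "NP"))
# ['the tall boy from the park', 'the tall boy', 'the park']
```

The output lists three noun phrases: the whole group, plus two smaller NPs nested inside it.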
c) Ambiguity
A phrase or sentence can have more than one structure, and each structure gives a different meaning.
Example:
The man from the school with the flag.
Who has the flag?
· the school?
· or the man?
Syntax helps computers detect and resolve such ambiguities.
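Detecting this kind of ambiguity can be sketched by counting how many structures a grammar assigns to the phrase. The code below uses a small CKY-style chart; the three rules and the lexicon are assumptions made up for this example, not a real grammar of English, and words missing from `LEXICON` raise a `KeyError`.

```python
from collections import defaultdict

LEXICON = {"the": "Det", "man": "N", "school": "N", "flag": "N",
           "from": "P", "with": "P"}
# Binary rules: (parent, left child, right child)
RULES = [("NP", "Det", "N"), ("NP", "NP", "PP"), ("PP", "P", "NP")]


def count_parses(words, goal="NP"):
    """Count the distinct trees that build `goal` over the whole word list."""
    n = len(words)
    # chart[i][j][label] = number of ways words[i:j] can form `label`
    chart = [[defaultdict(int) for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1][LEXICON[word]] = 1
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # every split point inside the span
                for parent, left, right in RULES:
                    chart[i][j][parent] += chart[i][k][left] * chart[k][j][right]
    return chart[0][n][goal]


print(count_parses("the man from the school with the flag".split()))  # 2
print(count_parses("the man from the school".split()))                # 1
```

A count of 2 corresponds to the two readings above: "with the flag" attaches either to "the school" or to "the man from the school". Choosing between them needs information beyond these rules.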
3) Agreement (Dependency)
Agreement means words must match each other in number, person, gender, etc.
Examples:
✔ This boy is tall
✘ These boy is tall
Rules like this let computers check grammatical correctness.
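A number-agreement check for the examples above might look like the following sketch. The `NUMBER` table is a toy feature lexicon assumed for illustration; it covers only grammatical number, and `agrees` is a hypothetical helper name.

```python
# Toy feature lexicon (an assumption for this sketch): word -> grammatical number.
NUMBER = {
    "this": "sg", "these": "pl",
    "boy": "sg", "boys": "pl",
    "is": "sg", "are": "pl",
}


def agrees(determiner, noun, verb):
    """True when determiner, noun and verb all carry the same number."""
    features = {NUMBER[w.lower()] for w in (determiner, noun, verb)}
    return len(features) == 1


print(agrees("This", "boy", "is"))    # True
print(agrees("These", "boy", "is"))   # False: plural determiner, singular noun
```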
Example of ambiguity:
Flying planes seem/seems dangerous.
· If we use seem, the plural noun “planes” is the subject (planes that are flying).
· If we use seems, the singular “flying” becomes the subject (the activity of flying planes).
Agreement helps decide the meaning.
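The same idea can be turned into a filter: list the candidate analyses of "flying planes" and keep only those that agree with the verb. The candidate list, the paraphrases, and the name `pick_reading` are assumptions written by hand for this sketch.

```python
# Two candidate analyses of "flying planes"; each names a head word and its number.
CANDIDATES = [
    {"head": "planes", "number": "pl", "reading": "planes that are flying"},
    {"head": "flying", "number": "sg", "reading": "the activity of flying planes"},
]
VERB_NUMBER = {"seem": "pl", "seems": "sg"}


def pick_reading(verb):
    """Keep only the analyses whose head agrees in number with the verb."""
    return [c["reading"] for c in CANDIDATES if c["number"] == VERB_NUMBER[verb]]


print(pick_reading("seem"))   # ['planes that are flying']
print(pick_reading("seems"))  # ['the activity of flying planes']
```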
4) Valency (Subcategorization)
Different verbs need different numbers of arguments.
· Intransitive = 1 argument
o He slept.
· Transitive = 2 arguments
o She ate an apple.
Computers use valency to check whether a sentence is complete.
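A completeness check based on valency can be sketched with a small lexicon. The `VALENCY` table below is hypothetical and deliberately simplified: it gives each verb exactly one frame, whereas many real verbs allow several.

```python
# Hypothetical valency lexicon: verb -> number of arguments it requires.
VALENCY = {"slept": 1, "ate": 2}


def is_complete(verb, arguments):
    """A clause is complete when the verb gets exactly the arguments it needs."""
    return len(arguments) == VALENCY[verb]


print(is_complete("slept", ["He"]))             # True
print(is_complete("ate", ["She", "an apple"]))  # True
print(is_complete("ate", ["She"]))              # False under this toy lexicon
```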
5) Embedding and Long-Distance Dependency
One sentence can be inside another sentence.
Example:
The girl that John visited left.
The embedded clause “that John visited” modifies “the girl”, who is understood as the object of “visited” and is also the subject of “left”.
Deep embedding is difficult for both humans and machines:
The man who said that the woman who knew the teacher who criticized the scholar left early…
Syntax tells computers how to track long-distance relations correctly.
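One way to make "long-distance" concrete is to record which word depends on which and measure how far apart they are. The dependency list below is a hand-written analysis of the example sentence, given purely for illustration; the relation names and the helper `dependency_lengths` are assumptions, not the output of any parser.

```python
WORDS = ["The", "girl", "that", "John", "visited", "left"]

# (head index, dependent index, relation)
DEPENDENCIES = [
    (5, 1, "subject"),  # left    <- girl
    (4, 3, "subject"),  # visited <- John
    (4, 2, "object"),   # visited <- that (standing in for "the girl")
]


def dependency_lengths(words, dependencies):
    """Return (head, dependent, relation, distance in words) for each link."""
    return [
        (words[h], words[d], rel, abs(h - d))
        for h, d, rel in dependencies
    ]


for head, dep, rel, dist in dependency_lengths(WORDS, DEPENDENCIES):
    print(f"{dep} is the {rel} of {head} (distance {dist})")
```

The subject "girl" sits four words away from its verb "left" because the embedded clause comes between them; each further level of embedding stretches such links.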
6) Conclusion
Syntax is the architecture behind meaningful sentences.
It explains:
- how verbs control arguments
- how phrases combine
- how agreement keeps grammar correct
- how ambiguity arises
- how computers can resolve ambiguity and understand sentences
Modern NLP prefers feature-based and dependency-based models. Without syntax, computers cannot understand or produce meaningful sentences; they can only list words.