Lecture 2 Methods and Procedures of Lexicological Research

1. Lviv National University named after Ivan Franko

Lviv National University
named after Ivan Franko
Department of Translation Studies and Contrastive
Linguistics named after Hryhoriy Kochur
Nadiya Andreichuk, Associate Professor

2. Lecture 2

Methods and
Procedures of
Lexicological Research
Contrast is the occurrence
of different elements
to create interest

3. Plan

1. Scientific standard – theory – method
2. Stages of linguistic investigation
3. Methods of analysis used in lexicology
3.1. The IC analysis
3.2. Distributional analysis
3.3. Transformational procedures
3.4. Componential analysis
4. Statistical methods of analysis
5. Contrastive analysis

4. Scientific standard – theory – method

Linguistics as an empirical science is
supposed to be based on the following
minimal standards:
The inquiry must deal with perceivable data of a
certain phenomenon.
The statements about the phenomenon (hypotheses)
must be objective.
The statements must be logical and coherent.
They must be systematically ordered (in addition to
Statements must be formulated so that they can be
proven wrong or inadequate (if they are).

5. Theory and model

A theory is a system of hypotheses for describing
and/or explaining a certain area of objects. Each
theory must satisfy certain requirements, such as
consistency, completeness, adequacy, simplicity.
A model is a formal representation of the structural
and functional characteristics of an object of study.
Models are used:
to explain a theory,
to simulate a process of research,
to illustrate the functioning of an object of study.

6. Induction

A scientist must begin by collecting observations or
producing data by experiments.
After he has made a large and sufficient number of
such observations or experiments, he proceeds to a
generalization about these data. This
generalization is expected to be supported by the
original data. After several attempts at generalizing
he may proceed to a new (modified) hypothesis by
looking at new data. The modified hypothesis
should cover the old original data and the new
data. This way of arriving at hypotheses is
called inductive.

7. Deduction

The scientist has some ideas, some knowledge, or
may be interested in a problem (including some
knowledge) as the input to his theory-construction.
How he comes to that knowledge is of no theoretical
consequence or importance. The scientist then
formulates a first working hypothesis as a tentative
answer to his problem. A good hypothesis is based
on common scientific standards. It is then tested
against a collection of observations or experimental
data and may be modified on the basis of these data. This
way is called deductive insofar as it assumes that
the hypothesis is derived (deduced) from already
existing knowledge and then tested by empirical data.

8. Ways of approaching data

9. Observation

The process of scientific investigation in the field of
lexicology may be subdivided into several stages.
Observation is an early and basic phase, the
centre of what is called the inductive method of
inquiry. All factual statements, i.e. statements
capable of objective verification, are based on
observation.
10. Classification

The orderly arrangement of the data obtained through
observation. E.g. it is observed that in English nouns the suffix
morpheme -er is added to verbal stems (speak + -er,
teach + -er, etc.) and noun stems (village + -er,
London + -er, etc.), and that -er also occurs in such
non-derived words as mother, father, etc.
Accordingly, all the nouns in -er may be classified
into two types, derived and simple words, and the
derived words may be subdivided into two groups
according to their stems.

11. Generalization

The collection of data and their orderly
arrangement must eventually lead to the
formulation of a hypothesis, rule or law. In
our case, we can formulate a rule that
derived nouns in -er may have either a verbal
or a noun stem. The suffix -er in combination
with adjectival or adverbial stems cannot
form nouns (to dig – digger, a noun, but big – bigger, a comparative adjective).
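
The classification of nouns in -er and the rule derived from it can be sketched in code. This is only a minimal illustration: the tiny lexicon STEMS and the function classify_er_noun are hypothetical names introduced here, and the spelling adjustments cover only the examples from the slides.

```python
# Hypothetical toy lexicon (an assumption for illustration): stem -> part of speech.
STEMS = {
    "speak": "verb",
    "teach": "verb",
    "dig": "verb",
    "village": "noun",
    "London": "noun",
}


def classify_er_noun(word):
    """Classify a noun in -er as derived (verbal or noun stem) or simple."""
    if not word.endswith("er"):
        raise ValueError(f"{word!r} does not end in -er")
    base = word[:-2]
    # Candidate stems: the bare base, the base with a restored final -e
    # (villag- -> village), and the base without a doubled consonant (digg- -> dig).
    candidates = [base, base + "e"]
    if len(base) >= 2 and base[-1] == base[-2]:
        candidates.append(base[:-1])
    for stem in candidates:
        if stem in STEMS:
            return ("derived", STEMS[stem])
    # No known stem: treat -er as part of a non-derived (simple) word.
    return ("simple", None)


for noun in ["speaker", "villager", "Londoner", "digger", "mother"]:
    print(noun, classify_er_noun(noun))
```

With a larger lexicon the same two-step procedure (derived vs. simple, then verbal vs. noun stem) would apply unchanged; only the spelling rules would need to be extended.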

12. Verifying process

One of the fundamental tests of the validity of a
generalization is whether or not this generalization
is useful in making reliable predictions. We may
predict, if we make use of statistical analysis,
that deverbal nouns with the suffix -er are more
likely to be coined than the other types of nouns
with this suffix.
Any linguistic generalization is to be followed by a
verifying process where various procedures of
linguistic analysis are applied. Selection of this or
that particular procedure depends on the goal set
before the investigator.

13. Methods of analysis used in lexicology

The research methods applied in lexicology have
always been closely connected with the
paradigms of linguistic research.
The development of structuralist ideas gave birth
to the structural method which includes several
types of analysis.

14. Immediate constituents analysis

The theory of Immediate Constituents (IC) was
originally elaborated as an attempt to determine the
ways in which lexical units are relevantly related to
one another.
It was discovered that combinations of units are
usually structured into hierarchical sets of binary
constructions.
The fundamental aim of IC analysis is to segment a
set of lexical units into two maximally independent
sequences or ICs thus revealing the hierarchical
structure of this set.

15. Immediate constituents analysis

Successive segmentation results in ultimate
constituents (UCs), i.e. two-facet units that cannot be
segmented into smaller units having both sound-form
and meaning. The procedure was first
suggested by L. Bloomfield ["Language"] and was
later developed by E. Nida ["Morphology: The
Descriptive Analysis of Words"].

16. Immediate constituents analysis

A sample analysis which has become almost
classical, being repeated many times by many
authors, is Bloomfield's analysis of the word
ungentlemanly. Comparing the word with the
other utterances the listener recognises the
morpheme un- as a negative prefix because he has
often come across words built on the pattern un- +
adjective stem: uncertain, unconscious, uneasy,
unfortunate, unmistakable, unnatural. One can also
come across the adjective gentlemanly.

17. Immediate constituents analysis

Thus at the first cut we obtain the following
immediate constituents: un- + gentlemanly.
If we continue our analysis we see that although
gent occurs as a free form in low colloquial usage,
no such words as lemanly may be found either as a
free or as a bound constituent, so this time we have
to separate the final morpheme. We are justified in
so doing as there are many adjectives following the
pattern noun stem + -ly, such as womanly,
masterly, scholarly, soldierly with the same
semantic relationship of "having the quality of the
person denoted by the stem"; we also have come
across the noun gentleman in other utterances.

18. Immediate constituents analysis

The two first stages of the analysis resulted in
separating a free and a bound form:
1) un- + gentlemanly, 2) gentleman + -ly.
The third cut has its peculiarities. The division into
gent- + -leman is obviously impossible as no such
patterns exist in English, so the cut is gentle +
man. A similar pattern is observed in nobleman,
and so we state adjective stem + -man.
The word gentle is open to discussion. If we
compare it with such adjectives as brittle, fertile,
juvenile, little, noble, subtle and some more
containing the suffix -le/-ile added to a bound stem,
they form a pattern for our case.

19. Immediate constituents analysis

To sum up: as we break the word we obtain at any
level only two ICs, one of which is the stem of the
given word. All the time the analysis is based on the
patterns characteristic of the English vocabulary. As
a pattern showing the interdependence of all the
constituents segregated at various stages we obtain
the following formula:
un- + {[(gent- + -le) + -man] + -ly}
This method of analysis is extremely fruitful in
discovering the derivational structure of words.
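
The formula above can be read as a binary tree in which every cut yields exactly two ICs. The following is a minimal sketch of that reading; the tuple encoding and the helper names (spell, cuts, ultimate_constituents) are assumptions made for illustration, not part of the IC procedure itself.

```python
# The formula un- + {[(gent- + -le) + -man] + -ly} as nested pairs.
# A string is an ultimate constituent (UC); a tuple is one binary cut.
UNGENTLEMANLY = ("un-", ((("gent-", "-le"), "-man"), "-ly"))


def spell(node):
    """Spell out the form covered by a node, dropping the morpheme hyphens."""
    if isinstance(node, str):
        return node.strip("-")
    left, right = node
    return spell(left) + spell(right)


def cuts(node):
    """Yield the two ICs obtained at every cut, outermost cut first."""
    if isinstance(node, str):
        return
    left, right = node
    yield spell(left), spell(right)
    yield from cuts(left)
    yield from cuts(right)


def ultimate_constituents(node):
    """Return the UCs in linear order."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return ultimate_constituents(left) + ultimate_constituents(right)


for left, right in cuts(UNGENTLEMANLY):
    print(left, "+", right)
print(ultimate_constituents(UNGENTLEMANLY))
```

The printed cuts follow the stages of the analysis: un + gentlemanly, gentleman + ly, gentle + man, gent + le.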

20. Distributional analysis

In linguistics, distributional
analysis is most commonly
associated with the name of
Zelig Harris.

21. Z. Harris

Many problems with his approach are well known:
the neglect of semantic aspects of language and the
lack of explicitness of his procedures.
syntactic rules manifest themselves in
distributional patterns,
distributional analysis is the central concern of

22. Z. Harris

In his theory, the analysis of distributional data
consists of two steps: "the setting up of elements,
and the statement of the distribution of these
elements relative to each other".
Segments of repetitions of an identical utterance
are called free variants of each other. More
complicated entities such as phonemes,
morphemes or syntactic categories are then built
up from the initial elements by searching for
groups of elements that do not contrast with each
other, because they have identical distributions

23. Z. Harris

For example,
a and an can be grouped into one class
"indefinite article" since they occur in
disjoint (mutually exclusive) environments;
our and his can be grouped into one class
"possessive pronoun" since they occur in
identical environments.
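
The comparison of environments in this example can be sketched as follows. This is a simplified illustration, not Harris's procedure: the toy sentences are invented, only the word immediately following an element is counted as its environment, and the names environments and relation are hypothetical.

```python
from collections import defaultdict

# Invented toy corpus (an assumption for illustration only).
CORPUS = [
    "she saw a car and an apple",
    "he took a book and an orange",
    "we sold our car and our book",
    "he sold his car and his book",
    "they ate our apple and our orange",
    "they ate his apple and his orange",
]


def environments(sentences):
    """Map each word to the set of words that immediately follow it."""
    env = defaultdict(set)
    for sentence in sentences:
        words = sentence.split()
        for word, following in zip(words, words[1:]):
            env[word].add(following)
    return env


def relation(env, x, y):
    """Compare the environments of two elements."""
    if env[x] == env[y]:
        return "identical"
    if env[x].isdisjoint(env[y]):
        return "disjoint"
    return "overlapping"


env = environments(CORPUS)
print("a / an:", relation(env, "a", "an"))
print("our / his:", relation(env, "our", "his"))
```

In this toy corpus a and an never share a following word, while our and his share all of them, which mirrors the two grounds for grouping given above.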

24. Distributional analysis

Distributional analysis in its various forms is commonly used nowadays
and is applied at the level of not only formal but also
semantic classes or subclasses of lexical units.
By the term distribution we understand the
position which a lexical unit occupies or may occupy
in the text or in the flow of speech.
Distributional analysis in lexicology aims to study
lexical units in terms of their distribution, i.e. their
immediate environment in the flow of speech.

25. Distributional analysis

It is assumed that the meaning of any lexical unit
may be viewed as made up by the lexical meaning of
its components and by the meaning of the pattern of
their arrangement, i.e. their distributional meaning.
This may perhaps be illustrated by the semantic
analysis of polymorphic words. The word singer, e.g.,
has the meaning of ‘one who sings or is singing’ not
only due to the lexical meaning of the stem sing- and
the derivational morpheme -er (= active doer), but
also because of the meaning of their distributional
pattern. A different pattern of arrangement of the
same morphemes *ersing changes the whole into a
meaningless string of sounds.

26. Distributional analysis

Thus it can be observed that in a number of cases
words have different lexical meanings in different
distributional patterns, e.g. the lexical meaning of
the verb to treat in the following:
to treat somebody well, kindly, etc. – ‘to act or
behave towards’, where the verb is followed by
a noun + an adverb;
to treat somebody to ice-cream, champagne, etc. –
‘to supply with food, drink, entertainment, etc. at
one’s own expense’, where the verb is followed by
a noun + the preposition to + another noun.

27. Distributional analysis

Compare also the meaning of the adjective ill in
different distributional structures, e.g. ill look, ill
luck, ill health, etc. (ill + N – ‘bad’) and fall ill, be
ill, etc. (V + ill – ‘sick’).
The interdependence of distribution and meaning
can be also observed at the level of word-groups. It
is only the distribution of otherwise completely
identical lexical units that accounts for the
difference in the meaning of water tap and tap
water. Thus, as far as words are concerned the
meaning by distribution may be defined as an
abstraction on the syntagmatic level.

28. Distributional analysis

Not only words in word-groups but also whole word-
groups may acquire a certain denotational meaning
due to a certain distributional pattern to which this
particular meaning is habitually attached.
For example, habitually the word preceding ago
denotes a certain period of time (an hour, a month, a
century, etc. ago) and the whole word-group denotes
a certain temporal unit. In this particular
distributional pattern any word is bound to acquire
an additional lexical meaning of a certain period of
time, e.g. a grief ago (D. Thomas), three
cigarettes ago (A. Christie), etc.

29. Distributional analysis

Distributional pattern as such seems to possess a
component of meaning not to be found in individual
words making up the word-group or the sentence.
Thus, the meaning ‘make somebody do something by
means of something’ cannot be traced back to the
lexical meanings of the individual words in ‘to coax
somebody into accepting the suggestion’.

30. Distributional analysis

The distributional pattern itself seems to impart this
meaning to the whole irrespective of the meaning of
the verb used in this structure, i.e. in the pattern
V+N+into+Ving verbs of widely different lexical
meaning may be used. One can say, e.g., to kiss
somebody into doing something, to flatter somebody into
doing something, to beat somebody into doing something,
etc.; in all these word-groups one finds the meaning
‘to make somebody do something’ which is actually
imparted by the distributional pattern.
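
The point that the pattern itself carries the meaning can be sketched with a simple pattern matcher. This is a rough illustration under stated assumptions: the regular expression only approximates V+N+into+Ving on plain spelling (it does no part-of-speech tagging), and the function name match_into_pattern is hypothetical.

```python
import re

# Approximation of the pattern V + N + into + Ving on spelled-out phrases:
# "to <verb> <noun> into <word ending in -ing>". No real tagging is done.
INTO_PATTERN = re.compile(r"\bto (\w+) (\w+) into (\w+ing)\b")


def match_into_pattern(phrase):
    """Return the slots of the pattern and its constant gloss, or None."""
    match = INTO_PATTERN.search(phrase)
    if match is None:
        return None
    verb, noun, ving = match.groups()
    return {
        "verb": verb,
        "noun": noun,
        "ving": ving,
        # The gloss does not depend on the lexical meaning of the verb.
        "pattern_meaning": "to make somebody do something",
    }


for phrase in [
    "to coax somebody into accepting the suggestion",
    "to kiss somebody into doing something",
    "to treat somebody to ice-cream",
]:
    print(phrase, "->", match_into_pattern(phrase))
```

Whatever verb fills the first slot, the gloss returned is the same, while a phrase built on a different pattern (to treat somebody to ice-cream) is not matched at all.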

31. Distributional analysis

To conclude, distribution defined as the occurrence
of a lexical unit relative to other lexical units can be
interpreted as co-occurrence of lexical items and the
two terms can be viewed as synonyms.
It follows that by the term distribution we
understand the aptness of a word in one of
its meanings to collocate or to co-occur with
a certain group, or certain groups of words
having some common semantic component.

32. Transformational analysis

Transformational analysis is the repatterning of various distributional
structures in order to discover difference or
sameness of meaning of practically identical
distributional patterns.
As distributional patterns are in a number of cases
polysemantic, transformational procedures are of
help not only in the analysis of semantic sameness /
difference of the lexical units but also in the analysis
of the factors that account for their polysemy.
Word-groups of identical distributional structure when
repatterned show that the semantic relations
between words and consequently the meaning may
be different.

33. Transformational analysis

E.g. the pattern "possessive pronoun" + "noun" (his
car, his failure, his arrest, his kindness).
According to transformational analysis the
meaning of each word-group may be represented
as:
he has a car, he failed, he was arrested, he is kind.
In each case a different meaning is revealed:
possession, action, passive action, quality.

34. Transformational analysis

The rules of transformation are rather strict
and shouldn't be identified with
paraphrasing in the usual sense of the term.
There are many restrictions on both the
syntactic and lexical levels.

35. Transformational analysis

I. Permutation – the repatterning on condition that
the basic subordinative relationships between words
and word-stems of the lexical units are not changed.
E.g. "His work is excellent" may be transformed
into "his excellent work", "the excellence of his work",
"he works excellently".
In the example given the relationships between
lexical units and the stems of the notional words are
essentially the same.

36. Transformational analysis

II. Replacement – the substitution of a component
of the distributional structure by a member of a
certain strictly defined set of lexical units,
e.g. replacement of a notional verb by an auxiliary
or link verb (he will make a bad mistake and he will
make a good teacher). The sentences have identical
distributional structure but only in the second one
can the verb "to make" be replaced by "become"
or "be".
The fact of impossibility of identical transformations
of distributionally identical structures is a formal
proof of the difference in their meaning.

37. Transformational analysis

III. Addition (or expansion) may be illustrated
by applying the procedure to the
classification of adjectives into two groups: adjectives
denoting inherent and non-inherent qualities.
E.g.
John is happy.
John is tall.
If we add the word-group in Moscow, we shall see that
"John is happy in Moscow" has meaning while
"John is tall in Moscow" is senseless.
This is accounted for by the difference in the meaning of
adjectives denoting inherent (tall) and non-inherent
(happy) qualities.

38. Transformational analysis

IV. Deletion – a procedure which shows whether
one of the words is semantically subordinated to the
other. E.g. in the word-group "red flowers" the word "red" may be
deleted, leaving "flowers", without
making the sentence senseless: I like red flowers or
I like flowers.
In the word-group "red tape" nothing can be deleted:
neither "I hate tape" nor "I hate
red" preserves the sense, because
the phrase "red tape" means
"bureaucracy" as a whole and cannot be divided into two parts.

39. Componential analysis

An attempt to describe the meaning of
words in terms of a universal inventory of
semantic components and their possible
combinations.
In this analysis linguists proceed from the
assumption that the smallest units of meaning are
sememes or semes.

40. Componential analysis

E.g. in the lexical item "woman" several sememes
may be singled out, such as human, not an
animal, female, adult.
The analysis of the word "girl" will show the
following sememes: human, female, young.
The last component of each of the two words differentiates
them and makes it impossible to mix up the words in
the process of communication. The classical way
of showing how componential analysis works is to
apply it to the so-called closed systems of
vocabulary, for example, colour terms.
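
The comparison of "woman" and "girl" above can be sketched with sets of components. This is a minimal illustration; the dictionary COMPONENTS and the function compare are hypothetical names, and "not an animal" is treated here as implied by "human".

```python
# Semantic components as given in the example above
# ("not an animal" is folded into "human" for this sketch).
COMPONENTS = {
    "woman": {"human", "female", "adult"},
    "girl": {"human", "female", "young"},
}


def compare(first, second):
    """Return (shared components, those only in first, those only in second)."""
    a, b = COMPONENTS[first], COMPONENTS[second]
    return a & b, a - b, b - a


shared, only_woman, only_girl = compare("woman", "girl")
print("shared:", sorted(shared))
print("woman only:", sorted(only_woman))
print("girl only:", sorted(only_girl))
```

The shared set holds what the two words have in common, and the remaining components (adult vs. young) are the ones that differentiate them.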

41. Componential analysis

Componential analysis is practically always combined
with transformational procedures or statistical
analysis. The combination makes it possible to find
out which of the meanings should be represented
first of all in dictionaries of different types and
how words should be combined in order to make
one's speech sensible.

42. Statistical analysis

Modern structural ways of analysis are often
combined with statistical procedures making the
whole approach more rigorous.
Statistics describes how things are on the average.
For a modern linguist it is not enough to know that it
is allowable for a given structure to appear, he is
interested in its frequency, in how often it appears.
It is, however, naive to think that a mere attachment
of numbers confers rigour on an argument, that
giving percentage or adopting mathematical
terminology automatically makes the study "exact",
"objective", "scientific".

43. Statistical analysis

Computation is useful only if it follows certain rules
of mathematical statistics;
the scholar must be able to state his margin of error
and to say in what relationship his data stand to the
whole body of similar language phenomena.
There has been a considerable growth of interest and
activity in statistical linguistics in the last decades.
The statistical approach is most helpful when we have
large masses of data to analyse, and this is precisely
the case with vocabulary study.

44. Statistical analysis

A single observation may not be reliable, whereas a
correctly executed statistical study shows trends, the
most typical properties and correlations. It is true
that some details are lost because statistical study is
necessarily simplifying and abstract. But a general
orientation may be gained, provided that the units
for analysis are well chosen and sufficiently defined
and that the factors we decide to take into
consideration (or disregard) correspond to the
purposes of the study.

45. Statistical analysis

Probably the best known result so far achieved in the
field of statistical linguistics is the formula known as
Zipf’s law. The formula states essentially that if the
words in a long text are ranked in order of decreasing
frequency of occurrence in the text, so that the most
frequent word has the rank r=1, the next most frequent
has the rank r=2, and so forth, then the product of
the rank r and the frequency f of any word in the text will be
approximately the same constant c, where c depends
on the length of the text. His other formula suggests
that the number of meanings in any polysemantic
word is proportional to the square root of its relative
frequency.
46. Statistical analysis

Successful efforts have been made to apply certain
statistical techniques to the study of problems in
historical lexicology. The statistical study of
vocabulary is gaining more impetus every year.
Counts are made for the vocabularies of great writers
and average speakers. One of the most prominent
representatives of statistical linguistics, Pierre
Guiraud, has estimated that the "passive" vocabulary
of an average educated person comprises about
20,000 words.

47. Summing up

No procedures or techniques exist that may serve as
panaceas [ˌpænə'sɪə] for all the difficult problems —
the method of investigation should be always chosen
or evolved according to the particular task with
which the investigator is confronted.
All methods aim at being impersonal and objective in
the sense that they must lead to verifiable
generalizations. In this effort to find verifiable
relationships concerning typical contrastive shapes
and arrangements of linguistic elements, functioning
in a system, the study of vocabulary has turned away
from chance observation and made considerable
scientific progress.