שׂבולת שׂמית

Stream-of-consciousness thoughts about why we say “Semitic” even though the root is “Shem”. And, yes, I know the Hebrew letters in the title say “semitic sibboleth” and not “shemitic shibboleth”.

In my youth I was exposed to many years of Greek and Latin, and a concomitant of pubescence was my learning that the ancient Greek sound inventory did not include the phoneme /sh/.

“It is for the lack of Sh in Greek”, revealed our teacher, “that we say See-mite, and not Shee-mite, which might incite unintended entomological afterthoughts”. Or at least that’s how I remember the observation. Maybe he was referring to the widow’s mite, for all I remember now.

Even though the root is Shem (שׁם), the name of Noah’s eldest, we say “Semitic” just because we got the word through Greek and the Hebrew name had to be transcribed in Greek as Sem (Σημ). By contrast, in Modern Hebrew, the Israelis do say “SH[emitic]” and “[anti]SH[emitism]”. [1]

( To the sound value /sh/ I henceforth refer, from a misguided sense of linguistic united-nationism, by the symbol /ʃ/ of the International Phonetic Alphabet, which denotes the “voiceless palato-alveolar sibilant”. )

Technically, Greeks can’t tell the difference between /s/ and /ʃ/, for these are “allophonic” in Greek. So to this day Greeks say pasas for the Turkish pasha or sakis (σαχής) for the Persian shah, a variety of monarch with which Greeks have been long familiar. Thus the famous Persian emperors Kurosh and Darayavahush are mostly known as Cyrus and Darius. (For the latter it’s a serious improvement.)

The distinction between /s/ and /ʃ/ is famously exploited in the Bible :

“Then said they unto him, Say now Shibboleth: and he said Sibboleth: for he could not frame to pronounce it right. Then they took him, and slew him at the passages of Jordan: and there fell at that time of the Ephraimites forty and two thousand.” [Judges 12:6, King James Version]

The Ephraimites were one of the Tribes of Israel and thus Hebrew-speaking. But Ephraim, son of Joseph, was half-Egyptian, was born in Egypt and grew up there. So the Ephraimites had lost the ability to distinguish between /s/ and /ʃ/ — at least that’s the story.

( The odd effect of being Egyptian has continued to the present day, linguistically. The first name of Nasser, the nationalist leader of Egypt 1954-70, was جمال which is pronounced Jamal in most other Arab countries, but not in Egypt, where the letter ج is rendered like the hard /g/ as in Gamal Abdel Nasser. ) [2]

When I read Judges 12:6, I had my first doubts about the original explanation for why we say Semitic instead of Shemitic. After all, the Hebrew Bible had been translated into Greek 2300 years ago and the story of shibboleth had to be told somehow. Since Biblical Greek was completely disdained at my school, I actually had no idea how “shibboleth” was rendered in the Septuagint (the Greek translation of the Hebrew Bible, completed in Alexandria around 200 BC). So I looked it up, and what a disappointment ! The Greek translation completely avoids the issue by not mentioning “shibboleth” at all. But wherever “Shem” is mentioned, it is transcribed as “Sem”.

But my doubts about the origin of “Semitic” in English would periodically resurface over the years. What does it say in the Vulgate, the Latin translation of the Bible by St Jerome ? It actually distinguishes between Scibboleth and Sibboleth, although this could be because late Vulgar Latin might also have distinguished between /s/ and /ʃ/. Certainly Italian does today. But the Vulgate renders Shlomo as Solomon and Ishmael as Ismael, so the /sc/ in the passage from Judges was probably ad hoc, to actually tell the tale, unlike the Septuagint.

And the name of Noah’s son is Saam (سام) in Arabic, Hebrew’s cousin language, even though Arabic does have a letter representing the /ʃ/ phoneme (ش). I am assured, however, by Semitic historical linguistics that thanks to the evolution of Semitic languages the Arabic /s/ corresponds to Hebrew /ʃ/ in cognates, e.g., SLM which is salam in Arabic but sholom in Hebrew.

Then I noticed that the word “Shemitic” appeared fairly often in books from the late 18th century, probably thanks to Protestants’ sense of Biblical accuracy. So I entertained the theory that the German polymath August Ludwig von Schlözer, who actually invented the category called “Semitic languages”, had confused Shem and Sem. Therefore we live with an error of German Orientalist scholarship. I found that plausible because the difference between the Hebrew letters sin /s/ and shin /ʃ/ amounts to a dot :

שׂ (sin)   שׁ (shin)


(And you can see how the Russian Ш /ʃ/ and the Greek letter sigma Σ are all related.) The two glyphs are identical but for the left-right difference in the placement of the dot above. And frequently the dot is not even written down ! “Those damned dots”, as Churchill’s father used to refer to decimal points when he was Chancellor of the Exchequer…
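As a rough illustration of how little separates the two letters in a modern text encoding, here is a minimal Python sketch. It uses only the standard unicodedata module; the helper name strip_points is made up for this example. The assumption is that the pointed letters are stored as the base letter ש followed by a combining dot, which is how Unicode usually represents them.

```python
import unicodedata

SHIN_DOT = "\u05C1"  # combining dot placed on the upper right
SIN_DOT = "\u05C2"   # combining dot placed on the upper left

SHIN = "\u05E9" + SHIN_DOT  # pointed shin, read /ʃ/
SIN = "\u05E9" + SIN_DOT    # pointed sin, read /s/


def strip_points(text: str) -> str:
    """Drop every nonspacing combining mark (category Mn), keeping bare letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# The Unicode names of the two dots.
print(unicodedata.name(SHIN_DOT))
print(unicodedata.name(SIN_DOT))

# Pointed, the two letters are different strings ...
print(SHIN != SIN)  # True

# ... but once the dot is left out, as it usually is, they are the same letter.
print(strip_points(SHIN) == strip_points(SIN))  # True
```

Because the filter removes every mark of category Mn, the same helper would also strip vowel points and cantillation marks, not just the sin and shin dots.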

There are other instances of “misreading” begetting a life of its own. In Turkish, the word Ottoman is Osmanli. Osman is the Turkish variant of the Arabic name Uthman (عثمان), the name of the third Caliph of Islam. Turks can’t lisp, so they said Usman or Osman. The English word Ottoman comes from an Italian corruption of the Arabic original. Like the Turks, the Italians can’t lisp, but unlike the Turks the Italians interpret /θ/ as /t/. Thus, Ottoman. The Germans remain faithful to the Turks : Osmanisches Reich.

Then recently I read In the Beginning : A Short History of the Hebrew Language, which despite the title spends an inordinate amount of time evaluating the work of the Masoretic scribes in great detail. These were the people who, in a fit of crazed diligence, feverishly marked up the Hebrew script, like this bit :


All those dots, lines, and squiggles are pronunciation guides for readers to ensure proper reading of scripture. Those are necessary because Hebrew, ancient and modern, is written with key pronunciation clues totally missing.

Semitic languages like Hebrew and Arabic lack letters for short vowels, and the 3 (or 3½) long vowels that do get written down nonetheless lead double lives as… consonants, and you aren’t told which is which and when. Five consonant pairs, including S/Sh and B/V, are distinguished only by dots, but (in Hebrew though not in Arabic) the dots are normally not written ! Doubled consonants are also unmarked, along with a bunch of other inconveniences which make the writing system vaguely hieroglyphic, from the point of view of a speaker of modern European languages.

Thus, prospective learners of Semitic languages like Arabic and Hebrew might have difficulty knowing how a string of letters is supposed to sound without looking it up in a dictionary. Modern Israeli newspapers totally dispense with those damned dots, lines and squiggles. If English were transcribed along unpointed Semitic principles, then the sentence “ripples in the sea show where a ship had passed near the boat” might be rendered :
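One way to play with this idea is a small Python sketch. The rules below are assumptions chosen purely for illustration, and far cruder than real Hebrew or Arabic orthography: “sh” is written like “s”, “v” like “b”, doubled consonants are written once, a word-initial vowel is replaced by an apostrophe standing in for a silent carrier letter, and every other vowel is dropped. The letters y and w are simply treated as consonants. The function names unpoint and transcribe are hypothetical.

```python
import re

VOWELS = set("aeiou")
SENTENCE = "ripples in the sea show where a ship had passed near the boat"


def unpoint(word: str) -> str:
    """Render one English word under the toy 'unpointed' rules described above."""
    w = word.lower()
    # Pairs that only a dot would distinguish are written alike.
    w = w.replace("sh", "s").replace("v", "b")
    # Doubled consonants are not marked, so write them once.
    w = re.sub(r"([^aeiou])\1", r"\1", w)
    head, tail = w[:1], w[1:]
    # A word-initial vowel leaves only a silent placeholder.
    if head in VOWELS:
        head = "'"
    # All remaining vowels go unwritten.
    return head + "".join(ch for ch in tail if ch not in VOWELS)


def transcribe(sentence: str) -> str:
    """Apply unpoint to every whitespace-separated word."""
    return " ".join(unpoint(word) for word in sentence.split())


print(transcribe(SENTENCE))
# rpls 'n th s sw whr ' sp hd psd nr th bt
```

Under these toy rules “ship” and “sip” both come out as “sp”, and “bt” could just as well be “boat”, “bat”, “bit” or “vote”, which is roughly the predicament of a reader facing unpointed text without a dictionary.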


So the Masoretic scribes, in order to banish forever any possibility of mispronunciation of scripture, decided to mark everything and I mean everything — vowels, consonants, doubled letters, sentence stops (there are no punctuation marks in the original Bible), phrase separations, etc.

The trouble is, those scribes lived anywhere between 1000 and 2000 years after the various parts of the Hebrew Bible had been composed. The Hebrew they spoke was definitely not ancient Hebrew. Also, there were many different Masoretic traditions in different places, such as Palestine, Mesopotamia, etc., and they do not agree completely on the pronunciations.

Yet one of those traditions, called Tiberian Masoretic, established today’s standard of pronunciation for Biblical Hebrew. And their work is also the basis of the pronunciation of modern Israeli Hebrew. And they also established the “official” text of the Hebrew Bible, on which all post-1500 translations of the Hebrew Bible have been based. Babylonian Jews, the Alexandrian Jewish translators of the Septuagint, Jesus, the Apostles, the early Church fathers, and St Jerome who did the Vulgate, had all read different pre-Masoretic texts, possibly quite different in some places, and these survive only in sorry fragments.

Thus, on those Tiberian shoulders much does hang. For example, the very first phrase of the Hebrew Bible, “In the beginning” from Genesis 1:1, was originally written simply as a string of 6 letters, VRASYT (בראשית), and nothing more, and it was not even separated from the subsequent words. By adding 8 markers, the Tiberian Masoretes turned this into a separate and discrete b’reshit, which is the basis for all subsequent interpretations of the phrase, both rabbinical and secular.

On the markers devised by the Tiberian Masoretes, Joel Hoffman, the author of the above-mentioned book, is quite clear : while the Tiberians did not simply project their own speech, they also certainly did not recapture the sounds of Ancient Biblical Hebrew. Hoffman goes into considerable depth to establish that point convincingly.

The Tiberian Masoretes mostly agree on the consonants with the Alexandrian Jewish translators of the Septuagint, but there are numerous discordances on vowels and sometimes on key values for consonants. It’s entirely possible that the Greek versions Abraham and Eua (Eve) are more accurate in representing ancient Hebrew than the “official” versions Avram and Havah. And Rebecca may actually be closer to the original than Rivkah. We don’t know for sure.

How does that illuminate the question of Semitic vs Shemitic ? Although Hoffman does not specifically address that question, he does give numerous instances of Masoretic interpretations not being supported by Greek transliterations of the Hebrew in the 3rd century BCE. That doesn’t mean the Masoretes were wrong, but we also can’t assume they were right.

So maybe, just maybe, the Masoretes got the sin/shin distinction in Hebrew wrong. In the original Hebrew of Judges, the distinction is between “shibboleth” spelt with shin/sin (ש) and “sibboleth” spelt with samekh (ס), another letter representing the sound /s/. It’s entirely possible that the /ʃ/ sound didn’t exist in Hebrew by the time Judges was composed, and our primary (only?) attestation is the Masoretes. Early Germanic had no /ʃ/ sound either, even though neither English nor German can be imagined without it now. A candidate for what sin/shin (ש) might have been is the voiceless alveolar lateral fricative, which Greek probably would also have rendered as /s/.

But the preceding paragraph could be utterly, completely and irredeemably inconsistent with some well-established part of Semitic historical linguistics. I’m not sure and I have to delve into it more deeply…

Edits :

[1] I’ve now been told that Israelis do not in practice say “anti-shemitic”. I only went by the location of the dot over ש in the dictionary. Edit : But apparently I was right the first time !

[2] The statement about Nasser’s first name has been criticised. The point here was a joke exploiting the fact that the Egyptian colloquial pronunciation of the Arabic letter ج is considered deviant by most of the Arab world, just as the Ephraimite pronunciation of “s(h)ibboleth” was considered nonstandard in Judges. Most of the rest of the Arab world realises the letter as /d͡ʒ/, /ʒ/ or /j/. Originally ج was /g/ or /gʲ/, which implies that Egyptian is more conservative and the other dialects, including Modern Standard Arabic, are the innovative ones. But that doesn’t change the social perception of Egyptian colloquial !

This entry was posted in Ancient Greek, Biblical Hebrew, Languages.

14 Responses to שׂבולת שׂמית

  1. Whyvert says:

    Seeing that marked-up Hebrew reminds me that the phrase “every jot and tittle” is from the KJV Bible. Seems apposite.


  2. Andonly says:

    I quite enjoyed your post on Ivrit. Although I can’t say I’ve ever heard a Hebrew speaker say “antiSHemite.” Also, Eve is not contemporarily pronounced Hava, but rather, Chava. (Easy enough to get there.)

    It’s just that the heh in Havah has now become a chet, typically rendered CH. No one I’ve run across pronounces it Hava or spells it with a heh. (I know a few Chavas.)

    (BTW, in modern Hebrew there’s no difference between the pronunciation of a chet and a chaf–which is a kaf without a dot–but it really would help, IMO, if we had a consistent convention in English transliteration for distinguishing between the two without resorting to diacritics. Like maybe chaf should always be ch and chet, kh.)


  3. On “antishemite”, I just went by the location of the dot in the dictionary. On Eve, I went with the Biblical Hebrew.

    ח rendered /x/ is originally Ashkenazi-Yiddish. My frame of reference will always be classical since it’s friendlier with all the pointings !

    “BTW, in modern Hebrew there’s no difference between the pronunciation of a chet and a chaf–which is a kaf without a dot–but it really would help, IMO, if we had a consistent convention in English transliteration for distinguishing between the two without resorting to diacritics. Like maybe chaf should always be ch and chet, kh”

    I say, transliterate ח as /h/ in obeisance to the original, and כ as /kh/.

    /ch/ just looks so pale-ghettoey.


  4. Andonly says:

    “ח rendered /x/ is originally Ashkenazi-Yiddish”

    Really? Was it indistinguishable from heh or was it like one of those Arabic halts of breath?

    Wait, it must have been. There’s a Hebrew singer, Mosh ben Ari, who performs a traditional song called Im Ninalu in which he occasionally pronounces “chai” with an Arabic-sounding asphyxiated emphasis…

    “I say, transliterate ח as /h/ in obeisance to the original, and כ as /kh/.”

    Nah, you can’t have miles and miles of /h/ in liturgical texts. It just fucks up the aesthetics of the typeface too badly.


  5. ה = originally the “normal” (glottal) / h /

    ח = the “Arab” (pharyngeal) H written in IPA as / ħ /

    Of course Arabic still has both sounds. I guess the Arab-origin Jews had / ħ / but that must have largely disappeared in Israel today.


  6. Anonymous says:

    You were correct originally, pseudoerasmus, in Hebrew, anti-semitism IS in fact “anti-shemiut”. (Anti pronounced more like “un-tee”). And the word for anti-semite is “antishemi”. Never heard a Hebrew speaker pronounce it otherwise


  7. Andonly says:

    Well, Anonymous is right about antishemiut/antishemi–I didn’t think to concede the less literal point. But what I meant was that Hebrew speakers don’t modify English per Pseudoerasmus’ remark that “Israelis do say ‘shemitic’ and ‘anti-shemitism’”.


  8. myb6 says:

    Interesting article. It’s not particularly relevant, but some of the “dots, lines, and squiggles” are trope, meaning they’re tonal guides for chanting, a kind of musical notation. I always found that fascinating.


  9. Enrico says:

    Reblogged this on Epenthesis and commented:
    Sounds complicated, right? That’s one of the reasons why I always wanted to learn Hebrew.


  10. j says:

    Enjoyed the article.


  11. j says:

    / ħ / is not disappearing fast enough. I am Hungarian and can’t pronounce it and get corrected all the time by the נודניקים


  12. friendship says:

    new to your blog, don’t know if this has been mentioned: RE: footnote 2: the same situation occurs with speakers of the various Chineses, with “backwards” or “hick” pronunciations being more archaic, more reflective of a distant past. Standard Mandarin (putonghua) is probably the most innovative among all dialects, and much of the best poetry (pre-1300s) loses its rhyme when recited in it, a problem southern schoolchildren can work around by using their “home” dialects to read the poems.


  13. I think that’s a common occurrence with the cosmopolitan standard dialect of many languages. It’s usually full of innovations introduced by new speakers, etc. Some of the most archaic Iranic languages are in the most backward places in Afghanistan, Tajikistan, Kurdistan, and the Caucasus.


  14. David Stern says:

    That’s the “official” pronunciation. Even some newsreaders use it. Ashkenazim don’t but many Sepharadim do….
    ח = the “Arab” (pharyngeal) H written in IPA as / ħ /
    On Abraham, the Masoretic pronunciation is Avraham. But the linguistic consensus is that the 6 letters which vary in sound according to whether they have a dagesh or not: BeGeD KeFeT (only 3 in modern Israeli Hebrew) only began to be pronounced in this way fairly late. Shin/Sin is a remnant of the earlier distinction between Shin/Sin/Thin, Khet/Het, and ‘Ayin/Gayin which is evident in the many words that are spelt the same today but have different meanings. All these distinctions are preserved in Arabic.

    Recently I read about the name we pronounce “Samson” in English and “Shimshon” in Modern Hebrew. In the Septuagint it is Sampson with a psi. First, it seems the first vowel changed between 300BCE and 600CE, and originally maybe the two shins were pronounced differently, with the sin represented by a psi in Greek. Shimshon is derived from Shemesh = sun – he lived in the area of Beit Shemesh. In Arabic sun is شمس (shin-mem-sin = shams) and so probably in Hebrew it once was too… and in deep history was pronounced shamsu.

