Book of Mormon wordprint studies have been used by both critics and apologists to provide valuable insights into the text of the Book of Mormon (BOM). The purpose of this post is to reintroduce my wordprint study, to review important past wordprint studies of the Book of Mormon, and to evaluate where the field stands in 2020.
2016 Three Voice N-S-L Model
I created the Three Voice N-S-L Model in 2016. Like many other Book of Mormon readers, I noticed two distinct styles in the Book of Mormon. The historical narrative from Mormon and the quoted sermons that Mormon and other narrators splice into the text seemed to me to have a significantly different style.
I applied the BYU Book of Mormon voices database to the text of the Book of Mormon to separate the text into Narrative and Sermon and then performed database queries and Excel modeling on this. This resulted in identifying many distinctive attributes of the narrative section “N Voice” and sermons section “S Voice”.
The N Voice is in past tense and usually passive, like “when they had been cast into prison”. The S Voice is in first person and usually present tense, like “I perceive that ye are”. The N Voice is where you get the phrases “came to pass”, “round about”, “among the people”, and “contend with”. Common S Voice phrases are “ye shall”, “I say unto you”, and “for behold”. The N Voice usually uses “God” as the name of deity. The S Voice uses many names, most frequently “Lord” and “Lord God”. When you get 19th-century-sounding phrases like “demands of justice”, “song of redeeming love”, or “reason to rejoice”, it’s always the S Voice. There is even significantly different usage of basic words like “if”, “the”, and “and” between the two voices.
This chart shows selected vocabulary and phrases and their counts in the N and S Voice text groups. The N group has a total of 119K words. The S group has a total of 130K words.
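For readers who want to try this kind of count themselves, here is a minimal Python sketch. It is not the database queries and Excel modeling described above, just one simple way to compare phrase rates between two text groups. The function names and the two tiny samples (taken from verses quoted in this post) are illustrative assumptions; a real run would load the full N and S text groups instead.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def phrase_count(tokens, phrase):
    """Count occurrences of a (possibly multi-word) phrase in a token list."""
    target = phrase.lower().split()
    n = len(target)
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == target)

def rate_per_1000(tokens, phrase):
    """Occurrences of the phrase per 1,000 words, so groups of different size compare fairly."""
    if not tokens:
        return 0.0
    return 1000.0 * phrase_count(tokens, phrase) / len(tokens)

# Hypothetical stand-ins for the N and S text groups (one quoted verse fragment each).
n_tokens = tokenize("And it came to pass that he did deliver them")
s_tokens = tokenize("And thus mercy can satisfy the demands of justice")

for phrase in ["came to pass", "demands of justice"]:
    print(phrase, rate_per_1000(n_tokens, phrase), rate_per_1000(s_tokens, phrase))
```

Normalizing to a rate per 1,000 words matters because the two groups are not the same size (119K vs 130K words), so raw counts alone can mislead.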
Here is a chapter-by-chapter breakout: red is N Voice, blue is S Voice. You can use the Book of Mormon voices database to produce this, or you can simply go through the text chapter by chapter, or sometimes verse by verse, and flip it to S or N based on who is talking.
Voices for Mormon, Moroni, Nephi, and Jacob are split between N and S Voice. It’s rare, and obvious when it happens, for Mormon to break from third person narrative to first person sermon. For example, in Mosiah 23, Mormon is narrating up through v. 20, then stops and addresses his audience in first person in v. 21-23, then goes back to narrative in v. 24. In this selection, v. 20 and 24 are considered N Voice and v. 21-23 are considered S Voice. It’s obvious when Mormon does this, less obvious when Moroni does it, and for Nephi it’s not obvious at all. I make the distinction in a somewhat arbitrary manner, dividing the text into N and S based on whether the intent is to tell a narrative or to insert sermon-style preaching.
20 And it came to pass that they did multiply and prosper exceedingly in the land of Helam; and they built a city, which they called the city of Helam.
21 Nevertheless the Lord seeth fit to chasten his people; yea, he trieth their patience and their faith.
22 Nevertheless—whosoever putteth his trust in him the same shall be lifted up at the last day. Yea, and thus it was with this people.
23 For behold, I will show unto you that they were brought into bondage, and none could deliver them but the Lord their God, yea, even the God of Abraham and Isaac and of Jacob.
24 And it came to pass that he did deliver them, and he did show forth his mighty power unto them, and great were their rejoicings.
Although the S Voice is a conglomeration of Alma, King Benjamin, Samuel the Lamanite, etc, this is a very consistent voice with consistent trends. For example, see three S Voice speakers all using very similar sounding language.
Mosiah 15:9 (Abinadi speaking)
Having ascended into heaven, having the bowels of mercy; being filled with compassion towards the children of men; standing betwixt them and justice; having broken the bands of death, taken upon himself their iniquity and their transgressions, having redeemed them, and satisfied the demands of justice.
Alma 34:16 (Amulek speaking)
And thus mercy can satisfy the demands of justice, and encircles them in the arms of safety, while he that exercises no faith unto repentance is exposed to the whole law of the demands of justice; therefore only unto him that has faith unto repentance is brought about the great and eternal plan of redemption.
Alma 42:15 (Alma speaking)
And now, the plan of mercy could not be brought about except an atonement should be made; therefore God himself atoneth for the sins of the world, to bring about the plan of mercy, to appease the demands of justice, that God might be a perfect, just God, and a merciful God also.
These should all be considered one voice, the BOM Sermon Voice, not as three distinct voices.
The N Voice combines multiple voices like Zeniff, Mormon, Helaman, and Moroni, but this also is a very consistent voice with consistent vocabulary and grammar trends.
Mosiah 9:11 (Zeniff narrating)
Therefore it came to pass, that after we had dwelt in the land for the space of twelve years that king Laman began to grow uneasy, lest by any means my people should wax strong in the land, and that they could not overpower them and bring them into bondage.
Alma 62:48 (Mormon narrating)
And the people of Nephi began to prosper again in the land, and began to multiply and to wax exceedingly strong again in the land. And they began to grow exceedingly rich.
Ether 7:19 (Moroni narrating–also includes the L Voice “did x” see below)
Wherefore, the son of Noah did build up his kingdom in his stead; nevertheless they did not gain power any more over Shule the king, and the people who were under the reign of Shule the king did prosper exceedingly and wax great.
Again, these should be viewed as one voice, the BOM Narrator Voice, not as three distinct voices.
While analyzing the nuances of N vs S, one more trend jumped out at me. The really clean distinction between N and S is much stronger in the first half of the book than the last (in Mosiah priority order). I also started seeing distinct vocabulary, phrase, grammar, and theme trends from start to end. I call this voice the Late Voice, or “L”: it seems to creep in starting at the end of Alma, gets stronger through Moroni, and continues into Nephi. As I mention in more detail later in this article, Brent Metcalfe was the first to notice this L Voice trend. It could be considered “monotonic stylistic drift”, which is common in large works where an author subtly changes his or her style over the time spent writing the work. I believe it’s a substantial enough trend to call it out as a unique voice.
Characteristics of the Late Voice “L”:
- increase of narrator speaking in first person (Mormon rarely does this from Mosiah to Alma but then starts to do it more often, Moroni does it frequently, and Nephi does it exclusively)
- more fluid switching between N and S, compared to primarily large blocks at a time in the first half
- increase in vocabulary and topics related to House of Israel, Jews, Gentiles, etc.
- increase in certain phrases: “by the power”, “face of the land”, “like unto”
- increase in vocabulary and topics related to the Book of Mormon writing process itself: “plates”, “writing”, etc.
- decrease in using some of the peculiar words and phrases strong in the first half: “thus”, “took”, “over the people”
- change in usage of common words and prepositions
Some charts that show these examples:
One of the most striking examples of the L Voice is the prominence of “House of Israel” themes, especially BOM writers speaking to or about future Jews and Gentiles. This chart represents frequencies for the following words/phrases:
Gentile, Gentiles, house of Jacob, Israel, Jew, Jews, Judah, tribe, tribes, Zion
NOTE: when I display the three blue bar charts together, this is showing frequency per blocks of text in 2,000 chunks. Top chart is entire BOM. Middle chart is isolating S Voice. Bottom chart is isolating N Voice.
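A chunked frequency series like the one in these bar charts could be sketched in Python as follows. This is only an illustration under stated assumptions: I read the note above as meaning chunks of 2,000 words each, so `chunk_size=2000` is an assumption, and the function name `chunk_counts` is hypothetical. The term list is the one given above.

```python
import re

# Term list from the chart description; multi-word entries are matched as phrases.
TERMS = ["gentile", "gentiles", "house of jacob", "israel", "jew",
         "jews", "judah", "tribe", "tribes", "zion"]

def chunk_counts(text, terms, chunk_size=2000):
    """Split the text into consecutive chunks of chunk_size words and
    return the total number of term hits in each chunk, in text order."""
    tokens = re.findall(r"[a-z']+", text.lower())
    term_tokens = [t.lower().split() for t in terms]
    counts = []
    for start in range(0, len(tokens), chunk_size):
        chunk = tokens[start:start + chunk_size]
        total = 0
        for tt in term_tokens:
            n = len(tt)
            # Whole-token matching: "jew" does not match inside "jews".
            total += sum(1 for i in range(len(chunk) - n + 1) if chunk[i:i + n] == tt)
        counts.append(total)
    return counts

# Tiny demonstration with an artificially small chunk size.
print(chunk_counts("Israel Zion the the Jews the the the", ["israel", "zion", "jews"], chunk_size=4))
```

To reproduce the three stacked charts, the same function would be run on the whole text, on the S Voice text only, and on the N Voice text only. One limitation of this sketch is that a phrase straddling a chunk boundary is not counted.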
The Book of Mormon as a topic suddenly becomes a major theme at the beginning of 3rd Nephi. There are a lot of references to what the authors feel compelled to write, references to the plates, and passages addressing the audience and how the Book of Mormon will be received.
This is a very interesting one, especially related to “Hebraisms” in the text. Grammar clauses using the word “or” have been argued as evidence of Hebraisms in the Book of Mormon, ie Hebrew forms of grammar and poetry that are evidence of the book’s roots in the ancient Near East. This area needs more investigation also, but my initial analysis is that the common Hebraisms that occur in the Book of Mormon (like chiasmus, for example), are heavily weighted to the first half of the book (Mosiah priority order).
“I, X” as in “I, Alma” or “I, Nephi”
Christopher Smith gave good insight in a Sunstone church history podcast episode into this L Voice concept.
In the Bible the whole creation narrative is told in the third person, “in the beginning God created the heavens and the Earth”. But in Joseph’s revision, the whole thing is in the first person, “in the beginning, I created the heaven and the earth”, so throughout the whole Joseph Smith revision, it is saying I, God created this and I, God created that, and this is classic Joseph Smith. He loves that emphatic first person voice. The Book of Mormon if you remember starts out “I, Nephi having been born of goodly parents”. The book of Abraham begins “I, Abraham, saw that it was needful for me to obtain another place of residence.” The record of John in section 93 says “And I, John, bear record that I beheld his glory”. This is all over Joseph’s translations, this emphatic first person voice and it gives them this tremendous voice of authority because we’re getting the message straight from the horse’s mouth.
This came out of the interesting work Skousen and Carmack are doing on Early Modern English (EModE) grammar and vocabulary in the Book of Mormon. Every time I hear something that could be pattern or trend based, I test it with my model, and it usually shows strongest in one of the three voices N, S, or L. This is a chart of the frequency of the phrase did + verb, ie did go, did eat, did march, did join, did come, etc. The most interesting frequency pattern here comes from isolating the N Voice in the second chart, since this pattern rarely occurs in the S Voice in the third chart. My findings so far are that the EModE patterns are heavier in the L Voice than in the first half.
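Here is a rough Python sketch of how did + verb could be counted. It is a crude heuristic, not a grammatical parser and not the Skousen-Carmack method: it treats any word following “did” as a verb unless it appears in a small exclusion list. That list (`NOT_VERBS`) and the function name are my own assumptions, and negated forms like “did not gain” are simply skipped.

```python
import re
from collections import Counter

# Assumed exclusion list: words that commonly follow "did" but are not the main verb.
NOT_VERBS = {"not", "he", "she", "they", "we", "i", "ye", "it", "the", "a",
             "this", "that", "these", "those", "so", "also"}

def did_verb_counts(text):
    """Count each word that directly follows 'did', skipping the exclusion list."""
    words = re.findall(r"[a-z']+", text.lower())
    pairs = Counter()
    for first, second in zip(words, words[1:]):
        if first == "did" and second not in NOT_VERBS:
            pairs[second] += 1
    return pairs

# Ether 7:19, quoted earlier in this post.
verse = ("Wherefore, the son of Noah did build up his kingdom in his stead; "
         "nevertheless they did not gain power any more over Shule the king, "
         "and the people who were under the reign of Shule the king "
         "did prosper exceedingly and wax great.")
print(did_verb_counts(verse))
```

Summing the counts per chunk of text, and doing so separately for the N and S text groups, would give a frequency series comparable to the charts described above.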
My best explanation as to why the N-S-L Model trends appear in the Book of Mormon.
- The Book of Mormon was carefully thought out over the time period prior to 1829. Going into the spring of 1829, there existed the Book of Mormon (most likely not written out but existing in Joseph’s mind in near-final-draft format) which likely ended with the Savior’s visit to the Nephites or included a short wrap up from Mormon but not including the Book of Ether or Moroni. This was likely created by one author (Joseph–or Mormon for traditional believers) intentionally creating a stylistic difference between the narrative and the sermons. The N Voice S Voice split was intentionally created to mimic different authors or speaking style. I believe it’s very skillfully done, but not outside the plausibility of one author. ie now I’m doing narrative, I use N Voice phrases, I’m in third person passive voice, I’m trying to sound very official ancient style copying a style like The Late War. Now I’m doing sermon. I speak in first person and use a lot of rhetorical questions, and I’m riffing on a 19th century revival preacher speaking in KJV language.
- Then, as Joseph and Oliver were in the middle of the translation project, Joseph’s updated vision of the Book of Mormon took over. The L Voice shows Joseph Smith breaking from the original text he had in his mind, breaking into new themes (House of Israel themes, covenant, and metadata, ie the book talking about itself), and shifting his vocabulary usage, either toward what he thought was more appropriate sounding or simply through stylistic drift at a subconscious level.
|N Voice||S Voice||L Voice||Authorship theory|
|Mormon||Mormon||Joseph||Traditional LDS, with Joseph exerting himself a bit in the text as he grew more confident|
|Mormon||pre 1829 Joseph||1829 Joseph||Expansion Model|
|pre 1829 Joseph||pre 1829 Joseph||Joseph||Joseph as single author|
|Solomon Spaulding||Sidney Rigdon||Joseph||Spaulding-Rigdon|
|Joseph||Joseph||Oliver||Joseph-Oliver conspiracy with Oliver exerting confidence along the way|
|Oliver||Oliver||Joseph||Joseph-Oliver conspiracy with Joseph taking over more along the way|
|16c translator||16c translator||Joseph||Skousen-Carmack model|
|Mormon||16c translator||Joseph||Skousen-Carmack model|
Using the N-S-L Model to explain other BOM wordprint studies
Case 1 1980 Rencher Larson study
BYU professors Rencher and Larson analyzed the voices in the Book of Mormon based on non-contextual word usage frequencies and determined that the four main authors (Alma, Nephi, Mormon, and Moroni) are all statistically distinct voices.
Here’s the chart they produced. They called one axis the “first discriminant function” and the second axis the “second discriminant function”.
This is a chart I created from my model. In my chart the X axis is the L Voice score (reversed, ie Nephi at beginning and Mormon’s voice covering the start of Mosiah last). Y axis is N:S score, sermon (top) to narrative (bottom).
Not a perfect match, but you can see that my N-S-L model captures the distinction between the voices of Alma, Mormon, Moroni, and Nephi. The important trends in the N-S-L model of first person vs third person, past tense vs present tense, and preference for certain vocabulary are clearly the drivers for both charts.
Case 2 1991 John Hilton with Cal-Berkeley
This study took non-contextual word patterns and compared BOM voices against each other and against external authors such as Joseph, Oliver, Solomon Spaulding, etc. They compared them head to head, for example Nephi vs Oliver, Alma vs Joseph, Nephi vs Alma. They determined that it’s unlikely that any of the external authors wrote the BOM, but more importantly the biggest text pattern difference was Nephi vs Alma.
They determined that the odds that Nephi and Alma were written by the same person are less than 1 in 15 trillion. This example shows why I’m skeptical of statisticians in this field of wordprint studies making claims of statistical significance. In my career as a data scientist, I have seen many statistical studies where statisticians make impressive claims about data sets like this, when in my opinion they are guilty of making overly simplistic assumptions when creating their models. The math is fine, but the math depends on the assumptions used to create the model. Those assumptions are usually where the models fail.
As my analysis shows, it’s certain that Nephi and Alma have different wordprint patterns. But it’s difficult to convert that into a claim that it’s impossible they’re by the same author. To do that, you would have to control for the obvious variables I note in my N-S-L Model. Did they control for past/present tense? Did they control for first person vs third person (not an issue in Nephi vs Alma but an issue with other BOM voice comparisons)? They counted up all the instances of “and”, “it”, and “to”, but did they account for the fact that Alma used the phrase “and it came to pass” four times while Nephi used it 184 times? Or that Alma uses the phrase “I say unto you” 80 times while Nephi uses it 19 times? There are other subtle, more difficult-to-do-intentionally differences in the two voices, but is it implausible for a single author? I don’t think anyone has properly evaluated that yet.
Case 3 1992 David Holmes British Statistician
David Holmes, a British non-LDS statistician, did several studies on the Book of Mormon. His technique was to compare noun richness. He concluded that the Book of Mormon voices (ie Nephi, Alma, Mormon, etc) don’t have statistically different vocabulary range and therefore it is likely the Book of Mormon was written by one author. Although the frequencies vary significantly, the vocabulary range between the N-S-L voices in the Book of Mormon is roughly the same. So this finding fits the N-S-L model.
He then analyzed different authors to compare to the BOM and found that Joseph Smith matched very well. The problem with this is that good writing samples for Joseph Smith from before the BOM don’t exist, and anything after the BOM, especially other scripture like the Doctrine and Covenants, could be affected by Joseph intentionally or unintentionally mimicking BOM style.
Case 4 1993 Metcalfe Mosiah Priority
In the 1993 book New Approaches to the Book of Mormon, Brent Metcalfe makes several logical points arguing for the Mosiah Priority Theory. A couple of these are the same observations I made (more than 20 years later, and with much better technology to help me) with the “L Voice” trend. I initially discovered this while analyzing the N vs S trends: I noticed that many linguistic patterns held very strong for the first half of the book (Mosiah Priority order) but then became weaker. When I dug into why, I noticed many linguistic patterns that bucked this pattern and instead followed a simple chronological trend from the beginning of the book to the end.
- Vocabulary usage. Metcalfe noticed a shift in vocabulary usage “whosoever” to “whoso”, “therefore” to “wherefore”, and “often” to “oft”. Christopher Smith in a blog post in 2012 added a new observation for “insomuch” to “inasmuch”. I’ve identified dozens of other n-grams that trend in this same way.
- Shift in doctrinal emphasis and change in themes. Metcalfe gives a few examples: doctrine of baptism, need and definition of a “church”. A few more I noticed that trend according to Mosiah Priority: House of Israel references and covenant/restoration theology, importance of a prophet, scripture and writing as theme.
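A vocabulary shift like “therefore” to “wherefore” can be tracked with a very small Python sketch: for each section of text, taken in Mosiah Priority order, compute what share of the two variants is the later one. The function names and the toy section texts below are hypothetical placeholders, not real data.

```python
import re

def variant_share(text, early="therefore", late="wherefore"):
    """Share of the later variant among all uses of either variant.
    Returns None when neither word appears in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    e = tokens.count(early)
    l = tokens.count(late)
    if e + l == 0:
        return None
    return l / (e + l)

def shares_by_section(sections, early="therefore", late="wherefore"):
    """sections is a list of (name, text) pairs, assumed to be in Mosiah Priority order."""
    return [(name, variant_share(text, early, late)) for name, text in sections]

# Toy placeholder texts, only to show the shape of the output.
toy_sections = [
    ("early section", "therefore they went and therefore they returned"),
    ("late section", "wherefore they went and therefore they stayed wherefore"),
]
print(shares_by_section(toy_sections))
```

A steady rise in this share from the first section to the last is the kind of start-to-finish trend described above. The same function works for “whosoever” vs “whoso” or “often” vs “oft” by changing the `early` and `late` arguments.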
Case 5 2008 Jockers Criddle Stanford Study
In 2008, Craig Criddle (an ex-Mormon Spaulding-Rigdon theory proponent) and Matt Jockers (a non-LDS Stanford wordprint expert) combined to publish a well-known study that they claimed strongly pointed to the Spaulding-Rigdon theory of BOM authorship.
They took many authors (Joseph, Oliver, Sidney, Isaiah, and many control authors) and tested them against the BOM. What they found is exactly what my N-S-L Voice model would predict. The model’s going to group all or most of the S Voice together and match up with the comparable that uses that style the closest. In Sidney’s sample, he uses Bible language, he writes in the first person a lot, and addresses the reader with rhetorical questions like “If ye say this?” It’s a very distinct style. The Jockers study matched this up very well with the S Voice portions of the BOM. This doesn’t mean he wrote the BOM S Voice, but if you set up a closed experiment, a model will match up S voice with that kind of writing. That’s what happened in the Jockers study.
For the N Voice chapters in the BOM, their study grouped them mostly together and matched up with the closest, and that happened to be Solomon Spaulding’s style. Spaulding writes in a narrative style, he uses past tense and third person, and often uses a passive voice. “after they had spoken”. It doesn’t mean Spaulding wrote the N Voice. It means his sample was closer to the N Voice portions of the BOM than Rigdon, Cowdery, Longfellow, Barlow, and Isaiah.
Faithful LDS statisticians ripped this study apart. They showed that, using the same methodology, Rigdon must have written the Federalist Papers. The study has been completely written off. But it might be good to pull it back out and understand that it was identifying something important.
Here’s a chart produced from the Jockers study, showing the Book of Mormon chapter by chapter: red if identified as a Spaulding match and blue if identified as a Rigdon match.
The next chart is from my model, identifying each chapter as Narrative or Sermon using the model scoring, not the actual BOM voices database breakout. Blank is for the Isaiah chapters or where the S-N signal was mixed or weak. The Spaulding or N Voice concentration is strongest during the war chapters of Alma. The Rigdon or S Voice chapters come on strong in certain areas: the end of 2nd Nephi, King Benjamin’s address, Abinadi’s preaching, Moroni’s sermon inserts, etc. As is expected with the L Voice theory, the N-Spaulding signal is weakest where the L Voice drowns it out a bit, in the small plates of Nephi and again from 3rd Nephi to the end of the book.
Case 6 2014 Chris and Duane Johnson Big Data Study
Brothers Chris and Duane Johnson used big data to crunch through 100,000 books written prior to 1830 to identify the closest matches to the BOM in terms of wordprint patterns. Among their closest hits were two books written in the “ancient style”: The Late War, Between the United States and Great Britain, from June 1812, to February 1815 (Gilbert J. Hunt, 1816) and The First Book of Napoleon, the Tyrant of the Earth: Written in the 5813th Year of the World (Modeste Gruau, 1809). The ancient style was a genre of writing somewhat popular in colonial America where a writer would mimic King James Bible language to make the narrative sound official.
I shared my N-S model data with Chris, and he was able to produce a similar replication of the Jockers Spaulding-Rigdon breakout that matched my N-S split, using The Late War as the N Voice and the Qur’an as the S Voice.
Case 7 2018 Fields Roper Study
In 2018, the Book of Mormon Central team published an article and YouTube video discussing a BYU Book of Mormon wordprint project led by Matthew Roper and Paul Fields. Their work appears to be an extension of the original 1980 Rencher Larson study.
They compared the Book of Mormon to 8 novels from four 19th century writers (Cooper, Dickens, Austen, Twain) and separated voices (ie Twain’s Tom Sawyer, Huck Finn, etc) from inside those novels to compare how those stack up against the BOM voices (ie Alma, Mormon, Jacob, etc). And when you take all the voices of 8 sample novels, they have many unique voices, but the combined diversity of all 8 is even less than the Book of Mormon. They claim their research shows 21 statistically unique voices in the Book of Mormon. This is a chart they published from that study laying out those 21 voices.
And here is a similar chart I generated using my N-S-L model.
For my chart, X axis left to right is strongest N on the left to strongest S on the right. Y axis top to bottom is the L Voice signal, which correlates to placement in the book from beginning to end (Mosiah priority).
The big bubbles (size of bubble correlates to total words attributed to this voice in the BOM) Mormon, Moroni, Nephi, and Alma are all generally in the right spots. Even the small bubbles (Enos, Zeniff, Helaman, Samuel, Abinadi) are in the right spots relative to the larger bubbles; impressive since the word sample size is small for those voices. Without knowing anything about the Roper-Fields methodology, I can guarantee that the important elements of their model are the same important elements in the N-S-L model.
To illustrate further, let’s break these out into four quadrants.
Mormon, Zeniff, and Helaman 1 are the N voice of the first half of the Book of Mormon.
Alma, King Benjamin, Abinadi, Amulek are the S Voice of the first half of the Book of Mormon.
Moving down vertically where L Voice signal is stronger, the last half of the Book of Mormon, we see the narrators Moroni, Enos, and Nephi that focused primarily on history/narrative.
In the bottom right quadrant, we have S dominant voices in the second half of the Book of Mormon: Nephi 2 (from the Book of Helaman) and Samuel the Lamanite and then Jacob and Lehi.
I make the same suggestions to Fields-Roper as I do above for the Hilton study. We need to rerun the analysis controlling for the N-S-L attributes that seem easy for an author to use to intentionally create different voices, ie past vs present tense, passive vs active verbs, first person vs third person, and removing common phrases like “I say unto you” or “it came to pass”.
Here are the conclusions I’ve made through this analysis.
Impact on authorship theories:
Faithful LDS view. I think the model rules out multiple, ancient voices coming through to the modern text, but it doesn’t rule out all faithful views. It fits well with a Skousen-Carmack 16c human translator responsible for S and N and Joseph Smith responsible for L. Or somehow the original gold plates text already contained the S-N split, for example if Mormon redacted more than he says he did.
Non-historical view. Joseph as single human creator/translator. I think it’s reasonable that one person could be responsible for the three voices, but I think it might change how we think of the length of the time of the creation process and possibly whether there was a written draft prior to 1829 to allow for a larger impact of stylistic drift. This reflects best my own view, though I actually think the N-S-L model makes best sense with at least two creators.
Spaulding-Rigdon view. The N-S-L Model is a home run for conspiracy theories, especially Spaulding-Rigdon. N is created first, independently, by Spaulding. Rigdon gets hold of his manuscript. Rigdon spends years stitching in his Campbellite restorationist views as the S Voice inserts. Then Joseph comes along and adds his prophetic vision through the L Voice. The data modeler in me wants to believe. But the lack of evidence for the conspiracy and my view of Joseph as prophet of the restoration stop me from taking this too seriously.
Oliver-Joseph conspiracy view. This also fits nicely with the N-S-L Model. Oliver hears Joseph is translating an ancient religious record. Oliver whips up his draft, responsible for N-S. He shows up in Palmyra and then Joseph’s collaboration comes through in the L Voice.
Each BOM statistical study can provide interesting insights into the text, but we should be wary about making absolute claims and especially attaching statistical significance to these.
The Three Voice N-S-L Model trends can be used to explain and interpret the findings of all Book of Mormon computer studies to date.
It’s extremely unlikely that multiple, distinct, ancient voices have been preserved through the translation of the Book of Mormon into the modern text. The N-S-L model provides strong evidence that there is a maximum of three distinct voices (N,S,L) in the Book of Mormon, with a minimum of one (L) of these being a modern translator/author.
A fruitful area for future BOM wordprint studies would be to determine if N vs S represents a distinction outside the norms of single authorship. I recommend isolating Mosiah and Alma for this study, as the L Voice corrupts this pure distinction. Focus for this study would be to compare narrative vs dialogue.
Another fruitful area for future BOM wordprint studies would be to determine if L Voice vs original (first half of BOM) represents a distinction outside the norms of single authorship. For this, stylistic drift would be the focus of study.
Any Book of Mormon authorship theory needs to adequately explain the trends related to the Three Voice N-S-L model, specifically 1) the difference between N and S, focusing on the first half of the BOM where the distinction is most pure and 2) the L Voice that trends start to finish according to Mosiah Priority.