Determining Word Frequency

by Allen Wyatt
(last updated June 26, 2018)

As you are analyzing your documents, you may wonder if there is a way to create a word frequency list. In other words, you may want to generate a list of every unique word in your document, along with the number of times it appears.

Unfortunately, Word doesn't include such a feature. You can, however, create your own using a macro. The following VBA macro is an example:

Sub WordFrequency()
    Dim SingleWord As String           'Raw word pulled from doc
    Const maxwords = 9000              'Maximum unique words allowed
    Dim Words(maxwords) As String      'Array to hold unique words
    Dim Freq(maxwords) As Long         'Frequency counter (Long avoids overflow past 32,767)
    Dim WordNum As Integer             'Number of unique words
    Dim ByFreq As Boolean              'Flag for sorting order
    Dim ttlwds As Long                 'Total words in the document
    Dim Excludes As String             'Words to be excluded
    Dim Found As Boolean               'Temporary flag
    Dim ans As String                  'User's answer to the sort prompt
    Dim aword As Range                 'Current word in the document
    Dim tmpName As String              'Full name of the attached template
    Dim j As Integer                   'Temporary variables
    Dim k As Integer                   '
    Dim l As Integer                   '
    Dim Temp As Long                   '
    Dim tword As String                '

    ' Set up excluded words
    Excludes = "[the][a][of][is][to][for][this][that][by][be][and][are]"

    ' Find out how to sort
    ByFreq = True
    ans = InputBox$("Sort by WORD or by FREQ?", "Sort order", "WORD")
    If ans = "" Then Exit Sub
    If UCase(ans) = "WORD" Then
        ByFreq = False
    End If
    
    Selection.HomeKey Unit:=wdStory
    System.Cursor = wdCursorWait
    WordNum = 0
    ttlwds = ActiveDocument.Words.Count

    ' Control the repeat
    For Each aword In ActiveDocument.Words
        SingleWord = Trim(LCase(aword))
        If Left(SingleWord, 1) < "a" Or Left(SingleWord, 1) > "z" Then SingleWord = ""    'Not starting with a-z?
        If InStr(Excludes, "[" & SingleWord & "]") Then SingleWord = "" 'On exclude list?
        If Len(SingleWord) > 0 Then
            Found = False
            For j = 1 To WordNum
                If Words(j) = SingleWord Then
                    Freq(j) = Freq(j) + 1
                    Found = True
                    Exit For
                End If
            Next j
            If Not Found Then
                WordNum = WordNum + 1
                Words(WordNum) = SingleWord
                Freq(WordNum) = 1
            End If
            If WordNum > maxwords - 1 Then
                j = MsgBox("The maximum array size has been exceeded. " & _
                  "Increase maxwords.", vbOKOnly)
                Exit For
            End If
        End If
        ttlwds = ttlwds - 1
        StatusBar = "Remaining: " & ttlwds & "     Unique: " & WordNum
    Next aword

    ' Now sort it by word or by frequency
    For j = 1 To WordNum - 1
        k = j
        For l = j + 1 To WordNum
            If (Not ByFreq And Words(l) < Words(k)) Or _
               (ByFreq And Freq(l) > Freq(k)) Then k = l
        Next l
        If k <> j Then
            tword = Words(j)
            Words(j) = Words(k)
            Words(k) = tword
            Temp = Freq(j)
            Freq(j) = Freq(k)
            Freq(k) = Temp
        End If
        StatusBar = "Sorting: " & WordNum - j
    Next j

    ' Now write out the results
    tmpName = ActiveDocument.AttachedTemplate.FullName
    Documents.Add Template:=tmpName, NewTemplate:=False
    Selection.ParagraphFormat.TabStops.ClearAll
    With Selection
        For j = 1 To WordNum
            .TypeText Text:=Trim(Str(Freq(j))) & vbTab & Words(j) & vbCrLf
        Next j
    End With
    System.Cursor = wdCursorNormal
    j = MsgBox("There were " & Trim(Str(WordNum)) & _
        " different words ", vbOKOnly, "Finished")
End Sub

When you open a document and run this macro, you are asked whether to sort the list by word or by frequency. Type WORD (the default) to get the list in alphabetical order. Type FREQ to get it in descending order of how many times each word appears in the document. As written, the macro treats any entry other than WORD as a request to sort by frequency, and a blank entry or a click on Cancel stops the macro.

While the macro is running, the status bar indicates what is happening. Depending on the size of your document and the speed of your computer, the macro may take a while to complete. (I ran it with a 719-page document with over 349,000 words and it took about five minutes to complete.)
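
Much of the running time comes from the macro searching its whole array of unique words for every word in the document. As a rough sketch of one alternative (not part of the original tip), the counting step could use a Scripting.Dictionary instead. This assumes the Microsoft Scripting Runtime is available on your system (Windows only); the names WordFrequencyDict, Counts, and Key are made up for illustration, and the results are written unsorted.

```vba
Sub WordFrequencyDict()
    ' Sketch only: counts words with a Scripting.Dictionary
    ' instead of a linear search through an array.
    Dim Counts As Object               'Late-bound Scripting.Dictionary (assumed available)
    Dim aword As Range                 'Current word in the document
    Dim SingleWord As String           'Cleaned-up word text
    Dim Excludes As String             'Words to be excluded
    Dim Key As Variant                 'Loop variable for output

    Excludes = "[the][a][of][is][to][for][this][that][by][be][and][are]"
    Set Counts = CreateObject("Scripting.Dictionary")

    For Each aword In ActiveDocument.Words
        SingleWord = Trim(LCase(aword.Text))
        ' Keep only tokens that start with a-z and are not excluded
        If SingleWord Like "[a-z]*" Then
            If InStr(Excludes, "[" & SingleWord & "]") = 0 Then
                If Counts.Exists(SingleWord) Then
                    Counts(SingleWord) = Counts(SingleWord) + 1
                Else
                    Counts.Add SingleWord, 1
                End If
            End If
        End If
    Next aword

    ' Write the (unsorted) results to a new document
    Documents.Add
    For Each Key In Counts.Keys
        Selection.TypeText Text:=Counts(Key) & vbTab & Key & vbCrLf
    Next Key
End Sub
```

The dictionary looks each word up by key, so it has no fixed limit such as maxwords. If you need sorted output, you would still have to copy the keys and counts into arrays and sort them.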

Note that there is a line in the macro that sets a value in the Excludes string. This string contains words that the macro will ignore when putting together the word list. If you want to add words to the list, simply add them to the string, between [square brackets]. Also, make sure the exclusion words are in lowercase.
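
As a small illustration, here is how the line might look with two extra words added. The additions "in" and "it" are only examples; use whatever words you want ignored.

```vba
    ' Original exclusion list plus two example additions: "in" and "it"
    Excludes = "[the][a][of][is][to][for][this][that][by][be][and][are]" & _
               "[in][it]"
```

Each word needs its own pair of brackets because the macro looks for the bracketed form of the word within the string.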

This tip (879) applies to Microsoft Word 97, 2000, 2002, and 2003.

Author Bio

Allen Wyatt

With more than 50 non-fiction books and numerous magazine articles to his credit, Allen Wyatt is an internationally recognized author. He is president of Sharon Parq Associates, a computer and publishing services company. ...

Comments

2018-06-05 13:08:44

Julie

The list formatted oddly when I ran the macro as written (due to the template used for the original document). I altered the macro to put the list in a document based on the Normal template.

Under ' Now write out the results
I removed
tmpName = ActiveDocument.AttachedTemplate.FullName
Documents.Add Template:=tmpName, NewTemplate:=False

And replaced it with
Documents.Add

Just in case anyone else is wanting to do that.


2018-02-07 22:47:04

Eugene

I have a product which I created from an operating system where I had 55 pages with 8 columns, headers/footers, page breaks, etc.

I managed to create VBA macros that leave me with just one column, 15 pages, and no headers/footers, page breaks, etc.

I would love to know if I can modify the macro you referenced, since the current one lists the first and last names separately. Ideally I would like to compare the first and last names as one, with the duplicates appearing on the new page created.

For example, I have a list of nothing but names: Sheridan, Whiteside; Jane, Doe; John, Doe; etc. The last and first names are separated by a comma. It is the whole name I want to compare, leaving the duplicates, first and last names, on the sheet that was created.

Thank you, Eugene


2017-09-22 06:05:31

Ken Endacott

John

See the tip and discussions at:
https://wordribbon.tips.net/T010761_Generating_a_Count_of_Word_Occurrences


2017-09-21 15:05:38

John

Is it possible to have the macro list only the words that are used more than once?

