Free Lessons
Courses
Seminars
TechHelp
Fast Tips
Templates
Topic Index
Forum
ABCD
 
Home   Courses   TechHelp   Forums   Help   Contact   Merch   Join   Order   Logon  
 
Home > Courses > Access > Developer > D39 > Introduction < D39 | Lesson 01 >
Introduction

Welcome! Searchable Doc Index with Word


 S  M  L  XL  FS  |  Slo  Reg  Fast  2x  |  Bookmark 

Welcome to Microsoft Access Developer Level 39. In this course we will focus on building a searchable document index in Microsoft Access using Word automation and VBA. We will discuss how to import different document types, extract their text, and load keywords into your database for easy searching, including handling exception words. We will cover building the necessary tables, linking documents to customers, setting up queries and subforms, using FollowHyperlink, working with recordsets, optimizing performance, tracking word counts, and adding progress indicators. This course builds on material covered in previous levels and is designed for those who have completed the earlier series.

Navigation

Keywords

Access Developer, searchable document index, keyword indexing, word automation VBA, import Word documents, import PDF text, file IO, recordsets, NoMatch property, exception table, junction table, composite key, search box, word count, query optimization

 

Start a NEW Conversation
 
Only students may post on this page. Click here for more information on how you can set up an account. If you are a student, please Log On first. Non-students may only post in the Visitor Forum.
 
Subscribe
Subscribe to Introduction
Get notifications when this page is updated
 
Intro Welcome to Microsoft Access Developer Level 39. In this course we will focus on building a searchable document index in Microsoft Access using Word automation and VBA. We will discuss how to import different document types, extract their text, and load keywords into your database for easy searching, including handling exception words. We will cover building the necessary tables, linking documents to customers, setting up queries and subforms, using FollowHyperlink, working with recordsets, optimizing performance, tracking word counts, and adding progress indicators. This course builds on material covered in previous levels and is designed for those who have completed the earlier series.
Transcript Welcome to Microsoft Access Developer Level 39 brought to you by AccessLearningZone.com. I am your instructor Richard Rost.

In today's class, we are going to focus on building a searchable document index. That is where we can take different documents and load the text of those documents into our database so we can search on the keywords. If you want to see which customers have documents that have the word elephant in them, then it is a very easy search to do.

We are going to learn how to use Microsoft Word Automation. In VBA, we will be able to click a button. We will have Microsoft Word open up in the background and it will read in the file. Whether it is a PDF, a text file, a Word document, or any document type that Word supports, you can do this with. We will take that document, save it as text, then we will pull the text into our Access database and index all the keywords.

We will build an exception table, of course, so we do not have to bother indexing words like the, in, or at.

We will work with recordsets some more. I know you guys love recordsets. I spent a lot of time on recordsets in earlier lessons. We will learn how to find records while a recordset is open and we will learn about the NoMatch property.

This lesson follows Access Developer 38 and everything that comes before it. I strongly recommend you have completed my Beginner, Expert, Advanced, and Developer series before continuing with this lesson. My lessons are meant to be taken one after the other. Each one builds on the one that came before it. Look at this page if you do not believe me. Read that.

I will be using Microsoft Access 365. It is currently July of 2022, so it is roughly equivalent to Microsoft Access 2021. However, the lessons covered in this course should be applicable going back to, I am going to say, Access 2007. So if you are using something earlier than that, forget it. 2007 and on, all of that should work just fine with what we are covering today.

If you have got questions about the stuff covered in today's class, scroll down to the bottom of the page that you are watching this on and post a comment right there on the lesson page. If you have got other questions that do not necessarily pertain to the topics covered today, but you have got other Access questions or Excel or Word, post them in the forums on my website.

Let us take a few minutes and cover exactly what we are going to do in today's class.

In lesson one, we are going to begin building a document index, a searchable document index. We are going to create the table to hold this information. We are going to use FollowHyperlink to open files. We have done that before. That is nothing new. We are going to then copy and paste the document text into the index table and create a search box so that we can search for documents that have any particular keywords in them. Then we will assign each document to customers so you can go to the customer's record and open up any documents related to that customer.

In lesson two, we are continuing on with our document index. We are going to make a button to browse and pick a file. We will automatically load that document into Microsoft Word in the background using some VBA and some Word automation. Then we will be able to convert any PDF file, Word document file, or text file into plain text that we can pull into our Access database. That is all covered in lesson two.

In lesson three, we are continuing with the document index. We are going to learn some reasons why you would want to build your own index. There are a bunch of reasons why. We will cover them in this lesson. There are a lot of limitations with long text fields. One of them is you cannot edit a long text field of more than 64k. It is 4,000 characters roughly. We are going to learn how to use file input/output from Access Developer 30 to import text files directly. We will open up Word, we will save the Word doc or the PDF as a text file, then we will use file I/O to read that in instead of copying and pasting it. All that is covered in lesson three.

In lesson four, we are continuing on with the document index. We are going to actually build the index tables that hold the data for all the keywords. The index table has the list of keywords in it just once. Then we will build a junction table to track which keywords are in which documents. We will use a composite key to prevent duplicates. We will write an AddToIndex subroutine and filter out unnecessary characters. Lots, lots more.

In lesson five, we are going to set up our document index Q and our document index subform - this little guy right here. You can click on a document and see whatever words are in that document down below. We will set up our exception table, which is a list of words you do not care to index: that, them, there, they, about, its, etc. Then we will make sure that the search now works taking the index table into consideration, which means we have to change the record source when we go between not searching or showing all the documents. It is a little tricky. A lot of good query stuff in this lesson and we are going to cover that all in lesson five. Here we go.

Finally, in lesson six, we are going to finish up this document index. We are going to display the percent completed right there. So you are going to see 73 percent completed while this index is building, which is great for you if you get longer documents and they take a while, like this big one that we did down here. It could take a few minutes to compile. That is what we are going to spend some time with at the end of the lesson.

We are going to optimize this. We are going to try and replace some of this SQL we have been writing with recordsets. We are going to use FindFirst and NoMatch, which is this new recordset stuff, folks. New recordset stuff. I know you guys love recordsets.

We are going to open dbOpenDynaset, which I forgot about when I was writing a code the first time. Then I got an error message and had to fix it. I know you guys love my error messages too.

We are going to add a word count. That is another reason for adding the recordset loop here. So you could see, OK, the word computer was in here in this document nine times, the word people is in it seven times.

We will do a lot more recordset coding. So this will conclude our document index, but this lesson has got a lot of cool stuff in it and some new material.

Here we go.
Quiz Q1. What is the main focus of Access Developer Level 39?
A. Building a searchable document index in Microsoft Access
B. Creating sales reports in Access
C. Developing a customer relationship management form
D. Designing navigation menus

Q2. What method will be used to pull text from different types of documents into Access?
A. ODBC Data Import
B. Microsoft Word Automation using VBA
C. Manual entry only
D. Access Macro Actions

Q3. Which type of documents can be converted and indexed using the techniques in this class?
A. Only Word documents
B. Only PDF files
C. Any file type supported by Microsoft Word
D. Only text files

Q4. What is the primary purpose of the exception table mentioned in the class?
A. To store customer information
B. To list customers who have overdue documents
C. To store words that should not be indexed
D. To hold sales tax rates

Q5. Why is it necessary to create a junction table in the context of the document index?
A. To prevent users from editing documents
B. To track which keywords are found in which documents
C. To delete documents from the database
D. To save back-up copies of each document

Q6. What issue with long text fields in Access is specifically mentioned in the lesson?
A. Text fields are not searchable at all
B. You cannot edit long text fields over 64,000 characters
C. Long text fields automatically format text as bold
D. Text fields accept only numbers

Q7. What is a benefit of using file input/output (file I/O) as described in Lesson 3?
A. It allows for direct use of binary images
B. It speeds up form design in Access
C. It enables importing text files directly into Access tables
D. It encrypts all database content

Q8. What is the 'NoMatch' property used for in relation to recordsets?
A. To determine if a recordset has been updated
B. To check if a search within an open recordset did not find a matching record
C. To format text for printing
D. To lock database tables

Q9. What does the percent completed feature show when building the index?
A. The number of customers linked to documents
B. The amount of time left before Access closes
C. The percentage of the document indexing task completed
D. The size of the Access database

Q10. What additional information is added when tracking word occurrences in documents?
A. The file path of each document
B. The date the document was created
C. The number of times each word appears in a document
D. The owner of the document

Q11. Why should students have completed the prior courses (Beginner, Expert, Advanced, and Developer series) before this lesson?
A. Because later lessons include easier material
B. Because each lesson builds on previous concepts and techniques
C. Because exam questions are only covered in earlier lessons
D. Because Microsoft Access 365 is required

Q12. Which Access versions are these techniques expected to work with?
A. Only Access 365
B. Access versions before 2000
C. Access 2007 and newer
D. Only Access for Mac

Answers: 1-A; 2-B; 3-C; 4-C; 5-B; 6-B; 7-C; 8-B; 9-C; 10-C; 11-B; 12-C

DISCLAIMER: Quiz questions are AI generated. If you find any that are wrong, don't make sense, or aren't related to the video topic at hand, then please post a comment and let me know. Thanks.
Summary Today's video from Access Learning Zone is Microsoft Access Developer Level 39. I am your instructor, Richard Rost.

In this lesson, the focus is on creating a searchable document index. The goal is to load various types of documents into your Access database so you can search for keywords within those documents. Imagine you want to locate all customer documents that mention the word "elephant" - with this system, performing that search becomes quick and simple.

We will cover Microsoft Word Automation in VBA, allowing you to press a button that opens Word in the background. Word will read the contents of files, whether they are PDFs, text files, Word documents, or any other type supported by Word. Once Word converts the document into plain text, we will import that text into our Access database. From there, we will index all the keywords so searching is more efficient.

To make the index meaningful, we will build an exception table to exclude common words like "the," "in," and "at" from our indexing process.

This class also involves working with recordsets again. I have emphasized recordsets in previous lessons, and now we will add to that with techniques for searching through records while a recordset is open and using the NoMatch property.

This lesson builds directly on Access Developer 38 and previous videos. I strongly recommend you have completed all of my Beginner, Expert, Advanced, and Developer series lessons before continuing here, as each class builds on the material that came before it. My videos are designed to be followed in sequential order.

For reference, I am using Microsoft Access 365, as of July 2022. The material should apply to Access 2021 as well, and even as far back as Access 2007. If you have an older version, you may encounter compatibility issues.

If you have any questions specifically about this lesson, you can post them in the comments area beneath the video. For other questions about Access, Excel, or Word, visit the forums on my website.

Now let me outline the structure of this course and what you can expect to learn.

In lesson one, we start developing the searchable document index. We will design the table structure to store the text from documents. You will learn to use the FollowHyperlink method to open files, then copy and paste document text into the index table. I will also demonstrate how to create a search box to find documents containing particular keywords. Each document can be linked to customers, so you can view all documents relevant to a specific customer right from their record.

In lesson two, we continue with the document index and add a button for browsing and selecting files. We will automate the process of opening Microsoft Word in the background using VBA, converting various document formats into plain text. That plain text will be brought into Access for indexing. This approach works for PDFs, Word documents, and text files.

In lesson three, we dive deeper into building your own document index and discuss why this approach is beneficial. I will explain some limitations with long text fields, such as editing restrictions when exceeding about 4,000 characters. We will look at using file input and output, a topic I first covered in Access Developer 30, to import text files directly. The process involves having Word or PDF files saved as text, then using file I/O to bring that content into Access instead of relying only on copy and paste.

Lesson four is where we create the keyword index tables in the database. The main index table will hold each keyword only once, and a junction table will track which keywords appear in which documents. This involves using a composite key to avoid duplicates. We will write an AddToIndex subroutine, filter out special characters, and cover related topics.

Lesson five focuses on setting up the document index query and the document index subform. This allows you to click on a document and view all associated keywords underneath. We will build an exception table for words that should not be indexed, such as "that," "them," "they," "about," or "its." The search function will be updated to take the index table into account. This does require some changes to the record source, depending on whether you are searching or viewing all documents. There is plenty of practical query work in this section.

Lesson six wraps up the document index project. We will display the percentage completed while building the index, which is especially helpful for larger documents that can take a while to process. This part will also address code optimization, swapping out some SQL statements for more efficient recordset code. We will use FindFirst and NoMatch properties of recordsets. If you've watched my earlier coding videos, you know I sometimes forget to open dbOpenDynaset and run into errors, but we'll address those here as well.

Additionally, we will implement a word count feature. This means you'll not only know which keywords appear in each document, but also how many times each word is used. This is another example of where looping through recordsets is necessary.

By the end of this lesson, you will have a fully functional, searchable, and efficient document index system in your Access database, complete with keyword tracking, customer association, and progress monitoring. There is plenty of new material in this lesson, and I look forward to sharing it with you.

You can find a complete video tutorial with step-by-step instructions on everything discussed here on my website at the link below. Live long and prosper, my friends.
Topic List Building a searchable document index in Access
Using Microsoft Word Automation with VBA
Loading document text into an Access database
Creating a table to store document text
Using FollowHyperlink to open files
Copying and pasting document text into table
Creating a search box for document keywords
Assigning documents to customers
Browsing and selecting files with a button
Automatically loading documents into Word via VBA
Converting PDF, Word, and text files to plain text
Importing extracted plain text into Access
Understanding limitations of long text fields
Using file input/output to import text files
Building index tables for keywords
Creating a junction table linking keywords and documents
Using composite keys to prevent duplicate records
Writing an AddToIndex subroutine
Filtering out unnecessary characters from keywords
Setting up an exception table for common words
Displaying indexed keywords in a subform
Adjusting search queries using the index table
Showing index build percent completion
Optimizing queries with recordsets
Using FindFirst and NoMatch with recordsets
Adding word count functionality to the index
Article In this tutorial, we are going to build a searchable document index in Microsoft Access. This means you will be able to bring different types of documents into your database, extract their text, and search for keywords within that text. For example, if you want to find all customer documents that mention the word elephant, this system will let you do that quickly and easily.

To achieve this, we will use Microsoft Word Automation in VBA. With Word Automation, you can programmatically open Word in the background, load a document, and extract its text. This approach is powerful because Microsoft Word can open and convert a wide range of document types, including Word files, PDF files, and plain text files. By using Word Automation, we can open a chosen document, save it as plain text, then read that text into our Access database for indexing.

First, you need to create a table in your Access database that will hold information about your documents. Typically, this table will have fields like DocumentID, CustomerID, DocumentName, DocumentText, and any other relevant information such as upload date. You should also have a Customers table if you are linking documents to specific customers.

To open documents directly from your database, you can use the FollowHyperlink command in VBA. This command allows you to open files, folders, and URLs from within Access. You might have a button on your form that, when clicked, runs code like the following:

Application.FollowHyperlink Me.txtFilePath

Assuming txtFilePath is a textbox containing the full file path, this will open the selected document using its default application.

Next, we want to automate the process of extracting text from documents. To do this, you must first add a reference to the Microsoft Word Object Library in your Access VBA editor. Go to Tools, References, and check Microsoft Word xx.x Object Library, matching your version number.

With that set up, you can create a VBA subroutine to open a document in Word, save it as plain text, then read that text file back into your database. Here is an example outline of how that code might look:

Sub ImportDocumentText(strFilePath As String)
Dim wdApp As Word.Application
Dim wdDoc As Word.Document
Dim strTempTextFile As String
Dim strTextContents As String
Set wdApp = New Word.Application
wdApp.Visible = False
Set wdDoc = wdApp.Documents.Open(strFilePath)
strTempTextFile = Environ("Temp") & "\temptext.txt"
wdDoc.SaveAs2 FileName:=strTempTextFile, FileFormat:=wdFormatText
wdDoc.Close False
wdApp.Quit
Set wdDoc = Nothing
Set wdApp = Nothing
Dim intFile As Integer
intFile = FreeFile
Open strTempTextFile For Input As #intFile
strTextContents = Input$(LOF(intFile), intFile)
Close #intFile
Kill strTempTextFile
' Now save strTextContents to your database
' Example: CurrentDb.Execute "UPDATE DocumentT SET DocumentText = '" & Replace(strTextContents, "'", "''") & "' WHERE DocumentID = " & Me.DocumentID
End Sub

You can call this subroutine from a button, passing in the file path of the document you want to index.

Now that you have the document text in your database, you will want to create a way to search for keywords. This can start with a simple form with a textbox for entering a keyword and a search button. The search button can run a query that looks for the keyword inside the DocumentText field. For example, your query might look like:

SELECT * FROM DocumentT WHERE DocumentText LIKE "*" & [Forms]![frmSearch]![txtKeyword] & "*"

This basic approach works, but for better performance and more advanced features, you will want to build an index of keywords. To do that, create a new table for the unique keywords, for example, KeywordT with KeywordID and Keyword. Then create a junction table, DocumentKeywordT, with DocumentID and KeywordID, to track which keywords appear in each document. Use a composite primary key (DocumentID, KeywordID) to avoid duplicates.

To build the index, write a VBA subroutine that goes through the DocumentText, splits the text into words, checks each word against a list of exceptions like "the", "and", "or", and indexes only the useful keywords. Create another table, ExceptionT, and fill it with the words you do not want to index.

Here is a sample outline of what your indexing subroutine might look like:

Sub AddToIndex(docID As Long, docText As String)
Dim arrWords() As String
arrWords = Split(docText, " ")
Dim db As DAO.Database
Dim rsKeyword As DAO.Recordset
Dim rsDocKeyword As DAO.Recordset
Dim rsException As DAO.Recordset
Set db = CurrentDb
Set rsKeyword = db.OpenRecordset("KeywordT", dbOpenDynaset)
Set rsDocKeyword = db.OpenRecordset("DocumentKeywordT", dbOpenDynaset)
Set rsException = db.OpenRecordset("ExceptionT", dbOpenSnapshot)
Dim word As String
Dim keywordID As Long
For Each word In arrWords
word = Trim(LCase(word))
word = Replace(word, ".", "")
word = Replace(word, ",", "")
' Add more replacements for other punctuation as needed
rsException.FindFirst "ExceptionWord = '" & word & "'"
If rsException.NoMatch And word <> "" Then
rsKeyword.FindFirst "Keyword = '" & word & "'"
If rsKeyword.NoMatch Then
rsKeyword.AddNew
rsKeyword!Keyword = word
rsKeyword.Update
rsKeyword.FindFirst "Keyword = '" & word & "'"
End If
keywordID = rsKeyword!KeywordID
rsDocKeyword.FindFirst "DocumentID = " & docID & " AND KeywordID = " & keywordID
If rsDocKeyword.NoMatch Then
rsDocKeyword.AddNew
rsDocKeyword!DocumentID = docID
rsDocKeyword!KeywordID = keywordID
rsDocKeyword!Count = 1
rsDocKeyword.Update
Else
rsDocKeyword.Edit
rsDocKeyword!Count = rsDocKeyword!Count + 1
rsDocKeyword.Update
End If
End If
Next word
rsKeyword.Close
rsDocKeyword.Close
rsException.Close
Set rsKeyword = Nothing
Set rsDocKeyword = Nothing
Set rsException = Nothing
Set db = Nothing
End Sub

This code assumes you have added a "Count" field to your DocumentKeywordT table, so you can keep track of how many times each word appears in the document.

As you build your document index, you may want to display progress to your users, especially with large documents. You can update a progress bar or percentage display as you go through the words so that users know how far along the process is.

Once your index tables and code are in place, you can build search forms that let users enter one or more keywords and see all documents that contain those words. You can even display all the keywords found for a selected document in a subform. This gives users a powerful way to explore and filter documents based on their content.

Throughout this process, you will make heavy use of recordsets in VBA. Recordsets allow you to read, find, edit, and add records dynamically as you process the document text and build your index. The FindFirst method helps you locate a specific record, and the NoMatch property indicates if a search found any results. Using dbOpenDynaset is important so that your recordset is updateable.

As you refine your system, you may want to optimize your queries and code further, perhaps replacing certain SQL-based operations with recordset-based loops for greater control and efficiency. If you run into errors, remember to watch for issues like accidentally opening a recordset as read-only when you need to edit it. Adding error handling can also help make your code more robust.

Finally, by building this searchable index, you create a flexible system where you can link documents to customers, easily search for documents containing any keywords, and even display word counts for analytics. This approach will work in recent versions of Access, including Access 2007 and newer, and uses powerful VBA automation with Word to make the process seamless. With these steps and examples, you should have a solid foundation for creating an efficient and useful document indexing system in your Access applications.
 
 
 

The following is a paid advertisement
Computer Learning Zone is not responsible for any content shown or offers made by these ads.
 

Learn
 
Access - index
Excel - index
Word - index
Windows - index
PowerPoint - index
Photoshop - index
Visual Basic - index
ASP - index
Seminars
More...
Customers
 
Login
My Account
My Courses
Lost Password
Memberships
Student Databases
Change Email
Info
 
Latest News
New Releases
User Forums
Topic Glossary
Tips & Tricks
Search The Site
Code Vault
Collapse Menus
Help
 
Customer Support
Web Site Tour
FAQs
TechHelp
Consulting Services
About
 
Background
Testimonials
Jobs
Affiliate Program
Richard Rost
Free Lessons
Mailing List
PCResale.NET
Order
 
Video Tutorials
Handbooks
Memberships
Learning Connection
Idiot's Guide to Excel
Volume Discounts
Payment Info
Shipping
Terms of Sale
Contact
 
Contact Info
Support Policy
Mailing Address
Phone Number
Fax Number
Course Survey
Email Richard
[email protected]
Blog RSS Feed    YouTube Channel

LinkedIn
Copyright 2026 by Computer Learning Zone, Amicron, and Richard Rost. All Rights Reserved. Current Time: 6/30/2026 4:08:35 AM. PLT: 1s
Keywords: Access Developer, searchable document index, keyword indexing, word automation VBA, import Word documents, import PDF text, file IO, recordsets, NoMatch property, exception table, junction table, composite key, search box, word count, query optimization  PermaLink  How To Build a Searchable Document Index With Word Automation and Recordsets in Microsoft Access