
Thread: How to extract data from word document?

  1. #1
    Join Date
    Jun 2011

    How to extract data from word document?

    For the past few days I have been trying to modify a macro so that it extracts data from Microsoft Word documents. When I run the macro below, I get an error at the very beginning, on the lines listed after the code:

    Sub GetDataFromWord()
    'Set a reference (Tools - References) to the
    'Microsoft Word x.0 Object Library
    Dim wdApp As Word.Application
    Dim wdDoc As Word.Document
    Dim sFile As String
    Dim rInput As Range
    'Define row and column of data in table
    Const lROW As Long = 2
    Const lCOL As Long = 2
    'Specify file that contains table
    sFile = "C:\Documents and Settings\Hashemi\Desktop\Macro test\test.doc"
    'instantiate Word and open document
    Set wdApp = New Word.Application
    Set wdDoc = wdApp.Documents.Open(sFile)
    'define range where data goes
    Set rInput = Sheet1.Range("a1")
    'Copy value from table and paste to cell
    With wdDoc.Tables(1)
    .Cell(lROW, lCOL).Range.Copy
    rInput.PasteSpecial xlPasteValues
    End With
    wdDoc.Close False
    'Quit Word so no hidden instance is left running
    wdApp.Quit
    Set wdDoc = Nothing
    Set wdApp = Nothing
    End Sub
    • ('Set a reference (Tools - References) to the
      'Microsoft Word x.0 Object Library)

      I also get an error on:
    • (Dim wdApp As Word.Application)

    How should I solve this?

  2. #2
    Join Date
    May 2009

    Re: How to extract data from word document?

    You can extract data from Microsoft Word with any language that can talk to COM, such as Perl or Python, because both have COM modules. For example, in Python (this needs the pywin32 package, which provides win32com):
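    import glob, os, win32com.client
    wordapp = win32com.client.Dispatch("Word.Application")
    path = "D:\\mydir"
    # look for the .doc files inside path, not in the current directory
    for name in glob.glob(os.path.join(path, "*.doc")):
        doc = os.path.abspath(name)
        print("processing", doc)
        txt = doc[:-3] + "txt"
        # open the document, save it as plain text, then close it
        wordapp.Documents.Open(doc)
        wordapp.ActiveDocument.SaveAs(txt, FileFormat=2)  # 2 = wdFormatText
        wordapp.ActiveDocument.Close()
    wordapp.Quit()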
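
    The file-handling half of that loop can be tried without Word installed. Here is a rough sketch of just that part; the helper name doc_to_txt_pairs is made up for this example, and it only works out which .doc files would be processed and what the matching .txt names would be:

```python
import glob
import os


def doc_to_txt_pairs(folder):
    """Return (doc_path, txt_path) pairs for every .doc file in folder."""
    pairs = []
    # Search inside the given folder rather than the current directory.
    for doc in sorted(glob.glob(os.path.join(folder, "*.doc"))):
        doc = os.path.abspath(doc)
        # splitext drops the extension whatever its length
        base, _ext = os.path.splitext(doc)
        pairs.append((doc, base + ".txt"))
    return pairs


if __name__ == "__main__":
    # "D:\\mydir" is the folder from the post above; change it to your own.
    for doc, txt in doc_to_txt_pairs("D:\\mydir"):
        print("processing", doc, "->", txt)
```

    Once the list looks right, each pair can be handed to Word through win32com to do the actual conversion.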

  3. #3
    Join Date
    Apr 2009

    Re: How to extract data from word document?

    You can use the VBScript below to read the text of a Microsoft Word document word by word. It starts with Option Explicit, so every variable has to be declared before it is used.
    Option Explicit
    REM We use "Option Explicit" to help us check for coding mistakes
    REM the Word Application
    Dim objWord
    REM the path to the Word file
    Dim wordPath
    REM the document we are currently reading data from
    Dim currentDocument
    REM the number of Words in the current document
    Dim numberOfWords
    Dim i
    REM where is the Word file located?
    wordPath = "C:\Data\Doc1.doc"
    WScript.Echo "Extract Data from " & wordPath
    REM Create an invisible version of Microsoft Word
    Set objWord = CreateObject("Word.Application") 
    REM don't display any messages about documents needing to be converted
    REM from  old Word file formats
    objWord.DisplayAlerts = 0
    REM open the Word document as read-only
    REM Open(path, ConfirmConversions, ReadOnly)
    objWord.Documents.Open wordPath, false, true
    REM Access the document
    Set currentDocument = objWord.Documents(1)
    REM How many words are in the document
    NumberOfWords = currentDocument.Words.count
    WScript.Echo "There are " & NumberOfWords & " words " & vbCRLF
    For i = 1 to NumberOfWords
    	WScript.Echo currentDocument.Words(i).Text
    Next
    REM Close the document without saving
    currentDocument.Close False
    REM Free memory used to store the document object
    Set currentDocument = Nothing
    REM exit Microsoft Word
    objWord.Quit
    Set objWord = Nothing

  4. #4
    Join Date
    May 2009

    Re: How to extract data from word document?

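    If the code above does not help you extract data from Microsoft Word, then try this one. I have edited it for you personally, so you only need to copy and paste the VBA code below. It runs in Word: it opens every document in the folder you pick, finds the "DTE/" and "Subject :" lines, and adds them to a new row of the table in the data document.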
    Sub ExtractData()
    Dim sDTE As String
    Dim sSubject As String
    Dim strFileName As String
    Dim strPath As String
    Dim oDoc As Document
    Dim dataDoc As Document
    Dim fDialog As FileDialog
    Set fDialog = Application.FileDialog(msoFileDialogFolderPicker)
    'Pick the folder with the letters
    With fDialog
        .Title = "Select Folder containing the documents to be modified and click OK"
        .AllowMultiSelect = False
        .InitialView = msoFileDialogViewList
        If .Show <> -1 Then
             MsgBox "Cancelled By User"
             Exit Sub
        End If
        strPath = fDialog.SelectedItems.Item(1)
        If Right(strPath, 1) <> "\" Then strPath = strPath + "\"
    End With
    'Close any open documents
    If Documents.Count > 0 Then
        Documents.Close SaveChanges:=wdPromptToSaveChanges
    End If
    strFileName = Dir$(strPath & "*.do?")
    'Assign the name of the document to take the data
    Documents.Open ("""D:\My Documents\Test\DTE data.doc""")
    Set dataDoc = ActiveDocument
    'Open the letters in turn
    While strFileName <> ""
        Set oDoc = Documents.Open(strPath & strFileName)
        'Reset the variables so values from the previous letter are not reused
        sDTE = ""
        sSubject = ""
        Selection.HomeKey wdStory 'Start from the top of the letter
        With Selection.Find 'find the first string
            Do While .Execute(findText:="DTE/*^13", _
                 MatchWildcards:=True, _
                 Wrap:=wdFindStop, Forward:=True) = True
                'Assign the found text to a variable and chop off
                'the last character (the paragraph mark)
                sDTE = Left(Selection.Range, Len(Selection.Range) - 1)
                Exit Do 'Stop after the first match
            Loop
        End With
        Selection.HomeKey wdStory 'Start from the top of the letter
        With Selection.Find 'find the second string
            Do While .Execute(findText:="Subject :*^13", _
                      MatchWildcards:=True, _
                      Wrap:=wdFindStop, Forward:=True) = True
               'Assign the second string to a variable and chop off
               'the last character and the leading text
               sSubject = Mid(Selection.Range, 10, Len(Selection.Range) - 10)
               Exit Do 'Stop after the first match
            Loop
        End With
        'Switch to the data document and add the content of
        'the variables to the blank row of the table
        dataDoc.Activate
        With Selection
            .EndKey wdStory
            .MoveUp Unit:=wdLine, Count:=1
            .MoveRight Unit:=wdCell, Count:=2 'Add a new blank row
            .TypeText Text:=sDTE
            .MoveRight Unit:=wdCell
            .TypeText Text:=sSubject
        End With
        'Close the letter without saving
        oDoc.Close SaveChanges:=wdDoNotSaveChanges
        Set oDoc = Nothing
        strFileName = Dir$()
    Wend
    'Save the data document
    dataDoc.Save
    End Sub
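
    The two wildcard searches in that macro ("DTE/*^13" and "Subject :*^13") each take everything from a marker up to the end of the paragraph. If you already have the letter as plain text, the same idea can be sketched in Python with regular expressions. The function name extract_fields and the sample letter are made up for illustration, and the sketch assumes paragraphs end with a carriage return (what Word's ^13 stands for) or a newline:

```python
import re


def extract_fields(text):
    """Return (dte, subject) from the plain text of one letter.

    A field that is not found comes back as None.
    """
    # Everything from "DTE/" up to (not including) the paragraph end.
    dte = re.search(r"DTE/[^\r\n]*", text)
    # Only the part after "Subject :", again up to the paragraph end.
    subject = re.search(r"Subject :([^\r\n]*)", text)
    return (
        dte.group(0) if dte else None,
        subject.group(1) if subject else None,
    )


# Invented sample text, with \r between paragraphs as in Word.
sample = "Dear Sir\rDTE/2011/045\rSubject :Annual report\rBody text\r"
print(extract_fields(sample))  # ('DTE/2011/045', 'Annual report')
```

    As in the macro, only the first match of each marker is used, and the paragraph mark is left out of the result.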

  5. #5
    Join Date
    Apr 2009

    Re: How to extract data from word document?

    If you haven't found the help you need above, just copy and paste the VBA code below into a module in Excel. After that, you also have to select the Microsoft Word Object Library under Tools - References.

    Sub CollateForms()
    Dim myPath As String
    Dim myWord As New Word.Application
    Dim myDoc As Word.Document
    Dim myField As Word.FormField
    Dim n As Long, m As Long
    Dim fs, f, f1, fc
    myPath = InputBox("Path?")
    Set fs = CreateObject("Scripting.FileSystemObject")
    Set f = fs.GetFolder(myPath)
    Set fc = f.Files
    m = 0
    For Each f1 In fc
        n = 0
        Set myDoc = myWord.Documents.Open(myPath & "\" & f1.Name)
        'Write each form field of this document into the next column
        For Each myField In myDoc.FormFields
            ActiveCell.Offset(m, n).Value = myField.Result
            n = n + 1
        Next myField
        myDoc.Close wdDoNotSaveChanges
        'Move down one row for the next document
        m = m + 1
    Next f1
    myWord.Quit
    Set myField = Nothing
    Set myDoc = Nothing
    Set myWord = Nothing
    End Sub

  6. #6
    Join Date
    Nov 2008

    Re: How to extract data from word document?

    I wrote and edited the following code for you to extract data from Word. It lists the full paths of the .doc files in a directory in column G of the Excel sheet.

    Sub LoadWordDoc()
    Dim F
    Dim x As Double
    Dim FolderYear As String
    Dim DocMonth As String
    Dim DocPath As String
    Dim FName()
    Dim Ext As String
    '///Load variables values
        DocMonth = Sheet1.Range("b1")
        FolderYear = Sheet1.Range("b2")
    '/// To create the path to search for files
        DocPath = "C:\Documents and Settings\montefem\My Documents\Excel Test\word Document\" + FolderYear + "\"
    '///Use DocMonth so that only documents with that month (e.g. Feb) in the name are listed
        Ext = "*" & DocMonth & "*.doc"
    '///Get the first matching file name; each later call to Dir() returns the next one
        ChDir (DocPath)
        F = Dir(DocPath & Ext)
        Application.DisplayAlerts = False
        x = 2
        With ActiveSheet
    '///Clear previous values
            .Range("G2:G" & .Rows.Count).ClearContents
    '///to place the files name in the settings sheet in the xls application to manipulate them
            Do While Len(F) > 0
                ReDim Preserve FName(2, x)
                FName(2, x) = DocPath & F
                .Cells(x, "G") = FName(2, x)
                x = x + 1
                F = Dir()
            Loop
        End With
        Application.DisplayAlerts = True
    '///x is still 2 when no file matched
        If x = 2 Then MsgBox "No Files"
    End Sub
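
    Listing only the documents whose name contains the month from cell B1 comes down to matching file names against a wildcard pattern. A small Python sketch of that one step, in case it helps to see it on its own; the function name docs_for_month and the file names are invented for the example:

```python
import fnmatch


def docs_for_month(file_names, month):
    """Keep only the .doc names that contain the given month text."""
    # "*Feb*.doc" matches any name with Feb somewhere before the .doc ending.
    pattern = "*" + month + "*.doc"
    return [name for name in file_names if fnmatch.fnmatch(name, pattern)]


# Invented file names for illustration.
names = ["Report Feb 2011.doc", "Report Mar 2011.doc", "Feb notes.txt"]
print(docs_for_month(names, "Feb"))
```

    The same wildcard pattern can be passed to Dir in VBA, since Dir accepts * in the file name part.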
