Visual Basic and the XML DOM: An Annotated Example
July 12, 2000
One of the most frequently asked questions we get concerning XML and Visual Basic is about how to work with the Microsoft XML Parser (MSXML.DLL) that comes with Internet Explorer 5. It's fine knowing how to write XML, but one needs to know how to read and manipulate an XML file.
This article walks through an annotated example of how to use the Microsoft XML Document Object Model (DOM) parser to load a TreeView control in Visual Basic. (Populating a TreeView is a good example, as XML documents themselves are tree-structured.) From there we go on to explain how to manipulate an XML document by adding and deleting data items.
This article is based on a section of the book "XML Programming with VB and ASP" (Manning Publications) by Mark Wilson and Tracey Wilson. This is an extract from Chapter 4, Section 9.
Download the source code that accompanies this article.
XML DOM Basics
The DOM consists of several interfaces. Every object in the DOM is called a "node," whether it's an element, attribute, CDATA section, comment, or processing instruction.
The DOMDocument object will contain our XML document. From there, we access our XML data via the DOMDocument's documentElement property, which returns the root element; that element can be used through either the IXMLDOMElement or the IXMLDOMNode interface.
These interfaces have a childNodes property, which is how we iterate through each of the "child" elements, comments, etc., of the parent node. The IXMLDOMNodeList interface allows us to access a collection of nodes, such as the children of an element.
The following diagram shows the hierarchy of these interfaces. Note that only the most common interfaces are shown below:
Figure 1: MSXML DOMDocument Interfaces
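As a rough illustration of these interfaces, the following sketch loads a small XML string and walks the children of its root element. It assumes a project reference to the Microsoft XML library; the function name DescribeChildren and the sample string are illustrative and are not part of the downloadable project.

```vb
' Minimal sketch: list the child nodes of a document's root element.
' Assumes a project reference to the Microsoft XML library.
' DescribeChildren is a hypothetical name used only for this example.
Public Function DescribeChildren(ByVal strXml As String) As String
    Dim objDoc As DOMDocument
    Dim objChild As IXMLDOMNode
    Dim strResult As String

    Set objDoc = New DOMDocument
    objDoc.async = False

    ' loadXML parses a string instead of a file; it returns False on failure
    If Not objDoc.loadXML(strXml) Then
        DescribeChildren = "Parse error: " & objDoc.parseError.reason
        Exit Function
    End If

    ' childNodes is an IXMLDOMNodeList; each item is an IXMLDOMNode,
    ' whether it is an element, a comment, or another node type
    For Each objChild In objDoc.documentElement.childNodes
        strResult = strResult & objChild.nodeName & _
            " (" & objChild.nodeTypeString & ")" & vbCrLf
    Next objChild

    DescribeChildren = strResult
End Function
```

Calling DescribeChildren("<PEOPLE><!-- note --><PERSON/></PEOPLE>") from the Immediate window should list the comment and the element as separate nodes, which is why later code checks nodeType before treating a child as an element.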
Our Example Program
Working back-to-front, here's what we're aiming for:
Figure 2: DOM Example screenshot
Clicking on the "Populate People" command button loads and displays the content of an XML "People" document. As you click on each person in the TreeView, the text box contents on the right-hand side will show details relating to the element you click, using the DOM object to get the details.
We've added a web-browser control to the form, so that you can see the changes happening to the XML file as you add and delete elements. To add a new person, click on the "Clear Items for new Person" button. Add your details to the text boxes. Then click on the "Save New Person" button. This creates the new "PERSON" element and adds it to the XML document.
To delete a person, click on a person in the TreeView. Then click on the "Delete Person" button, which will automatically delete the selected person, removing him/her from the TreeView, DOM Object, and the actual XML file.
Running the Example
To run this project, download the source code, then
- copy the contents of the downloaded zip into a directory,
- make sure that you have the "Microsoft XML 2.0" dll installed, and
- run the DOMExample.vbp.
One word of caution before we examine the code: because the DOMDocument and the TreeView use the term "node" for specifying their objects, we've tried to explicitly differentiate between the two in our comments.
Right, let's dig into the code!
Source XML Document
The following XML document has been used for this example:
<?xml version="1.0"?>
<!-- *********** Resumes for People *********** -->
<!DOCTYPE PEOPLE SYSTEM "people2.dtd">
<PEOPLE>
<PERSON PERSONID="p1">
<NAME>Mark Wilson</NAME>
<ADDRESS>911 Somewhere Circle, Canberra, Australia</ADDRESS>
<TEL>(++612) 12345</TEL>
<FAX>(++612) 12345</FAX>
<EMAIL>markwilson@somewhere.com</EMAIL>
</PERSON>
<PERSON PERSONID="p2">
<NAME>Tracey Wilson</NAME>
<ADDRESS>121 Zootle Road, Cape Town, South Africa</ADDRESS>
<TEL>(++2721) 531 9090</TEL>
<FAX>(++2721) 531 9090</FAX>
<EMAIL>Tracey_Wilson@somewhere.com</EMAIL>
</PERSON>
<PERSON PERSONID="p3">
<NAME>Jodie Foster</NAME>
<ADDRESS>30 Animal Road, New York, USA</ADDRESS>
<TEL>(++1) 3000 12345</TEL>
<FAX>(++1) 3000 12345</FAX>
<EMAIL>Jodie_Foster@somewhere.com</EMAIL>
</PERSON>
<PERSON PERSONID="p4">
<NAME>Lorrin Maughan</NAME>
<ADDRESS>1143 Winners Lane, London, UK</ADDRESS>
<TEL>(++94) 17 12345</TEL>
<FAX>(++94) 17 12345</FAX>
<EMAIL>Lorrin_Maughan@somewhere.com</EMAIL>
</PERSON>
<PERSON PERSONID="p5">
<NAME>Steve Rachel</NAME>
<ADDRESS>90210 Beverly Hills, California, USA</ADDRESS>
<TEL>(++1) 2000 12345</TEL>
<FAX>(++1) 2000 12345</FAX>
<EMAIL>Steve_Rachel@somewhere.com</EMAIL>
</PERSON>
</PEOPLE>
Module Variables
These module-level variables go into the General Declarations section of the form:
General_Declarations
Option Explicit
' The member variables used throughout the project
Private m_objDOMPeople As DOMDocument
Private m_blnItemClicked As Boolean
' constant used throughout the code for when we want to
' save the document
Const XMLPATH As String = "people2.xml"
Populating the TreeView from the DOMDocument
When the user clicks on the "Populate People" button, the following code populates the TreeView with the contents of the DOMDocument:
Private Sub cmdPopulate_Click()
Dim objPeopleRoot As IXMLDOMElement
Dim objPersonElement As IXMLDOMElement
Dim tvwRoot As Node
Set m_objDOMPeople = New DOMDocument
The resolveExternals property controls whether the m_objDOMPeople object looks for external files, which in our case means people2.dtd. Setting it to False prevents the lookup; we set it to True because we want the parser to find the DTD.
m_objDOMPeople.resolveExternals = True
The validateOnParse property controls whether m_objDOMPeople validates the XML file against people2.dtd. Setting it to False skips validation; we set it to True because we want validation to occur.
m_objDOMPeople.validateOnParse = True
The XML file first needs to be loaded into the DOMDocument (m_objDOMPeople). We load the file synchronously, so that the Load call returns only once the XML file has been completely loaded into the DOM; therefore we set the async property to False.
m_objDOMPeople.async = False
Call m_objDOMPeople.Load(XMLPATH)
Check that the load of the XML document was successful.
If m_objDOMPeople.parseError.reason <> "" Then
There has been an error with the loaded XML--show the reason.
MsgBox m_objDOMPeople.parseError.reason
Exit Sub
End If
Get to the root element of the XML document--bypassing the comments, PIs, etc.
Set objPeopleRoot = m_objDOMPeople.documentElement
Once we have obtained the root element (documentElement) of the DOMDocument, we can start working with the data from the XML file. The first thing we want to add to the TreeView is the name of the root element of the XML file, which is "PEOPLE". Then we need to start adding child nodes.
Set the TreeView control properties.
tvwPeople.LineStyle = tvwRootLines
tvwPeople.Style = tvwTreelinesPlusMinusText
tvwPeople.Indentation = 400
Check if the TreeView has already been populated: if so, remove the root, which removes everything.
If tvwPeople.Nodes.Count > 0 Then
tvwPeople.Nodes.Remove 1
End If
Add the root node to the TreeView and label it with the name of the root element.
Set tvwRoot = tvwPeople.Nodes.Add()
tvwRoot.Text = objPeopleRoot.nodeName
Now all we need to do is iterate through the childNodes of the DOMDocument's "PEOPLE" element, which returns the five "PERSON" elements. This is done by the "For Each" loop, which leads us to the code in the populateTreeWithChildren method.
Iterate through each "PERSON" element in the DOM to fill the tree; populateTreeWithChildren in turn iterates through the childNodes of that element (objPersonElement).
For Each objPersonElement In objPeopleRoot.childNodes
populateTreeWithChildren objPersonElement
Next
webTarget.Navigate XMLPATH
cmdDelete.Enabled = True
cmdClear.Enabled = True
End Sub
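The load check in cmdPopulate_Click displays only parseError.reason. When a DTD is involved it can help to know where the parser stopped, so here is a small sketch of a helper that also reports the error code and position. The name DescribeParseError is hypothetical; errorCode, line, linepos, and reason are properties of the MSXML parseError object.

```vb
' Hypothetical helper: build a fuller message from a failed load.
' Returns an empty string when the last load or parse succeeded.
Public Function DescribeParseError(objDoc As DOMDocument) As String
    With objDoc.parseError
        If .errorCode = 0 Then
            DescribeParseError = ""
        Else
            ' Line and linepos locate the problem in the source text
            DescribeParseError = "Error " & .errorCode & _
                " at line " & .Line & ", column " & .linepos & _
                ": " & .reason
        End If
    End With
End Function
```

In the click handler, one possible use would be to call MsgBox DescribeParseError(m_objDOMPeople) in place of showing the reason alone.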
The populateTreeWithChildren Subroutine
This code is called from the cmdPopulate click event. For each parent node created on the TreeView, it drills down into the DOM element that has been passed in and populates the TreeView with that element's childNodes.
The parameter, objDOMNode, is the current child node of the DOMDocument's documentElement.
Private Sub populateTreeWithChildren(objDOMNode As IXMLDOMElement)
Dim objNameNode As IXMLDOMNode
Dim objPersonElement As IXMLDOMNode
Dim objAttributes As IXMLDOMNamedNodeMap
Dim objAttributeNode As IXMLDOMNode
Dim intIndex As Integer
Dim tvwElement As Node
Dim tvwChildElement As Node
We need to add a title to the current TreeView node, by adding the element's main child node ("NAME") as the heading for the TreeView of this section. We use the method selectSingleNode to return the first node that it finds with the nodeName of "NAME".
Set objNameNode = objDOMNode.selectSingleNode("NAME")
Add the "NAME" element's parent node nodeName and its value to the TreeView.
Set tvwElement = tvwPeople.Nodes.Add(1, tvwChild)
tvwElement.Text = objNameNode.parentNode.nodeName & ": " _
& objNameNode.nodeTypedValue
Add the ID of the node to the TreeView node's tag property. The element node holds the ID attribute that we want to store in the tag, as an identity reference. We therefore need to get hold of this node to get its value.
Set objAttributes = objDOMNode.Attributes
We know that we've named our ID reference "PERSONID", so we ask the named node map (IXMLDOMNamedNodeMap) for this node by using the getNamedItem method. But first, we need to check whether there are any attributes:
If objAttributes.length > 0 Then
Set objAttributeNode = objAttributes.getNamedItem("PERSONID")
Store this value in the tag of the TreeView.
tvwElement.Tag = objAttributeNode.nodeValue
End If
tvwElement.EnsureVisible
intIndex = tvwElement.Index
We then need to populate the current TreeView node from the current DOM element node (objDOMNode) by once again iterating through the childNodes of the objDOMNode.
For Each objPersonElement In objDOMNode.childNodes
Set tvwChildElement = tvwPeople.Nodes.Add(intIndex, _
tvwChild)
tvwChildElement.Text = objPersonElement.nodeTypedValue
Next
End Sub
This completes our population of the TreeView from the DOMDocument.
Populating the Text Boxes from the DOMDocument
When the user clicks on the TreeView, we need to fetch data from the DOMDocument to populate the text boxes on the right-hand side of the screen.
Private Sub tvwPeople_Click()
Dim objSelNode As Node
A little bit of VB checking for unwanted event firing--if this code has just come from the expand and collapse event, then ignore it.
If m_blnItemClicked = True Then
m_blnItemClicked = False
Exit Sub
End If
Set objSelNode = tvwPeople.SelectedItem
Call the procedure that handles populating the text boxes.
populatePeopleDetails objSelNode
End Sub
Private Sub populatePeopleDetails(objSelNode As Node)
This procedure populates the form's text boxes with details from the passed Node object (objSelNode). The parameter objSelNode is the current TreeView node, which the user has clicked on.
Dim objPersonElement As IXMLDOMElement
Dim objChildElement As IXMLDOMElement
If objSelNode Is Nothing Then Exit Sub
Ignore this TreeView selection unless it is a "PERSON" node that has been clicked on:
If Trim(objSelNode.Tag) <> "" Then
We need to find the node in the DOMDocument that corresponds to the TreeView item the user has clicked:
Set objPersonElement = _
m_objDOMPeople.nodeFromID(objSelNode.Tag)
lblElement.Caption = objPersonElement.nodeName & ": ID = " & _
objPersonElement.Attributes(0).nodeValue
With this found node (objPersonElement), once again we are going to iterate through its childNodes to populate the text boxes, by using a "Select Case" to filter which element we are dealing with.
For Each objChildElement In objPersonElement.childNodes
Check that the type of node you are dealing with is an element node, as it could also be other types of nodes, for example PIs. Then populate the text box with the text from the DOM node.
If objChildElement.nodeType = NODE_ELEMENT Then
Select Case UCase(objChildElement.nodeName)
Case "NAME"
txtName.Text = objChildElement.nodeTypedValue
Case "ADDRESS"
txtAddress.Text = objChildElement.nodeTypedValue
Case "TEL"
txtTel.Text = objChildElement.nodeTypedValue
Case "FAX"
txtFax.Text = objChildElement.nodeTypedValue
Case "EMAIL"
txtEmail.Text = objChildElement.nodeTypedValue
End Select
End If
Next objChildElement
End If
Clear the object memory resources.
Set objChildElement = Nothing
Set objPersonElement = Nothing
End Sub
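One thing to keep in mind with nodeFromID is that it only matches attributes that the DTD declares as type ID, so the lookup depends on the DTD having been loaded. The following sketch assumes that people2.dtd declares PERSONID that way (the DTD is not reproduced in this article) and falls back to a pattern search when nodeFromID finds nothing. FindPersonByID is a hypothetical name.

```vb
' Hypothetical helper: find a PERSON element by its PERSONID value.
' Assumes PERSON elements are direct children of the root element.
Public Function FindPersonByID(objDoc As DOMDocument, _
                               ByVal strID As String) As IXMLDOMNode
    Dim objFound As IXMLDOMNode

    ' nodeFromID relies on the DTD declaring the attribute as type ID
    Set objFound = objDoc.nodeFromID(strID)

    If objFound Is Nothing Then
        ' Fallback: match the PERSONID attribute by value
        Set objFound = objDoc.documentElement.selectSingleNode( _
            "PERSON[@PERSONID='" & strID & "']")
    End If

    Set FindPersonByID = objFound
End Function
```

The function returns Nothing when neither lookup succeeds, so callers should test the result before using it.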
Adding a New Person to the DOMDocument
Here the user will fill the text boxes on the right-hand side with the details for the new person: these then need to be saved to the DOMDocument.
This is done in two parts: firstly, clearing all the text boxes, then saving the content. We know this is not "cool" coding but it's just for ease of coding to show the basics of working with the DOMDocument.
Private Sub cmdClear_Click()
This clears the contents of the text boxes.
lblElement.Caption = ""
txtName.Text = ""
txtAddress.Text = ""
txtTel.Text = ""
txtFax.Text = ""
txtEmail.Text = ""
cmdAdd.Enabled = True
End Sub
After filling in the text boxes, the user clicks on the "Save New Person" button, which fires cmdAdd_Click to save the new "PERSON" to the DOMDocument.
Private Sub cmdAdd_Click()
saveNewPerson
cmdAdd.Enabled = False
End Sub
Private Sub saveNewPerson()
This method creates a new element from the user input, and adds it and its children to the DOM object.
Dim objPerson As IXMLDOMElement
Dim objNewChild As IXMLDOMElement
Firstly we need to create a new "PERSON" child node (objPerson) for the documentElement node ("PEOPLE").
Set objPerson = m_objDOMPeople.createElement("PERSON")
Before continuing, we also need to give this "PERSON" node a "PERSONID" attribute--here we use our getNewID method to return this ID string. We need to do this as the attribute has been set up as "required" in the DTD. We use the setAttribute method to set this attribute value and add the attribute to the element node.
objPerson.setAttribute "PERSONID", getNewID
m_objDOMPeople.documentElement.appendChild objPerson
To this new "PERSON" child we must add its childNodes ("NAME", "ADDRESS", etc). We have chosen to use the createElement method to add the childNodes. However, you can choose to use the createNode method; which you use is just a matter of preference.
Set objNewChild = m_objDOMPeople.createElement("NAME")
objNewChild.Text = txtName.Text
Each childNode needs to be appended to the current "PERSON" node.
objPerson.appendChild objNewChild
Set objNewChild = m_objDOMPeople.createElement("ADDRESS")
objNewChild.Text = txtAddress.Text
objPerson.appendChild objNewChild
Set objNewChild = m_objDOMPeople.createElement("TEL")
objNewChild.Text = txtTel.Text
objPerson.appendChild objNewChild
Set objNewChild = m_objDOMPeople.createElement("FAX")
objNewChild.Text = txtFax.Text
objPerson.appendChild objNewChild
Set objNewChild = m_objDOMPeople.createElement("EMAIL")
objNewChild.Text = txtEmail.Text
objPerson.appendChild objNewChild
We reuse the populateTreeWithChildren method to keep the TreeView in sync with the changes to the DOMDocument.
populateTreeWithChildren objPerson
The new element may be added to the DOMDocument, but it is not added to the source XML until you save the document.
m_objDOMPeople.save XMLPATH
webTarget.Refresh
Set objPerson = Nothing
Set objNewChild = Nothing
End Sub
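Two details of saveNewPerson are worth a sketch. The getNewID routine is called but not listed in this article, and the createNode alternative is mentioned without an example. The helpers below are hypothetical stand-ins and are not the authors' code: GetNextPersonID assumes IDs follow the "p1", "p2" pattern of the sample file, and AppendTextElement wraps the create, set text, and append steps using createNode.

```vb
' Hypothetical stand-in for getNewID: try p1, p2, ... and return
' the first ID that no PERSON element under the root uses yet.
Public Function GetNextPersonID(objDoc As DOMDocument) As String
    Dim lngNext As Long
    lngNext = 1
    Do While Not objDoc.documentElement.selectSingleNode( _
            "PERSON[@PERSONID='p" & lngNext & "']") Is Nothing
        lngNext = lngNext + 1
    Loop
    GetNextPersonID = "p" & lngNext
End Function

' Hypothetical helper: create a child element with createNode,
' set its text, and append it to the given parent.
Public Function AppendTextElement(objDoc As DOMDocument, _
        objParent As IXMLDOMNode, ByVal strName As String, _
        ByVal strText As String) As IXMLDOMNode
    Dim objChild As IXMLDOMNode
    ' createNode takes a node type, a node name, and a namespace URI
    Set objChild = objDoc.createNode(NODE_ELEMENT, strName, "")
    objChild.Text = strText
    objParent.appendChild objChild
    Set AppendTextElement = objChild
End Function
```

With helpers along these lines, each block of three statements in saveNewPerson could shrink to a single call such as AppendTextElement m_objDOMPeople, objPerson, "NAME", txtName.Text.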
Deleting a Person from the DOMDocument
Clicking the "Delete Person" button allows the user to delete a node from the DOMDocument.
Private Sub cmdDelete_Click()
deleteSelectedPerson tvwPeople.SelectedItem
End Sub
Private Sub deleteSelectedPerson(objSelNode As Node)
The following code deletes the selected TreeView node, as well as the corresponding item from the DOMDocument. The parameter objSelNode is the TreeView node that has been clicked on.
Dim objPersonElement As IXMLDOMNode
If no tree node has been selected, then exit the procedure.
If objSelNode Is Nothing Then Exit Sub
Once again, we need to find the ID stored in the selected TreeView node, or its parent node tag property, to be able to delete it.
If Trim(objSelNode.Tag) = "" Then
If Trim(objSelNode.Parent.Tag) <> "" Then
Set objSelNode = objSelNode.Parent
End If
End If
Once we have found this ID in the Tag property of the selected TreeView node, use the nodeFromID method of the DOMDocument to return the DOMDocument node that needs to be deleted.
If Trim(objSelNode.Tag) <> "" Then
Set objPersonElement = _
m_objDOMPeople.nodeFromID(objSelNode.Tag)
If the DOMDocument node exists, use the removeChild method that can be accessed via the DOMDocument's documentElement property.
m_objDOMPeople.documentElement.removeChild objPersonElement
We need to save the DOMDocument, to reflect the removal of the childNode. The TreeView also needs to be kept in sync with this deletion.
m_objDOMPeople.save XMLPATH
tvwPeople.Nodes.Remove objSelNode.Index
webTarget.Refresh
End If
End Sub
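The listing above calls removeChild without first testing whether nodeFromID actually returned a node. A guarded version might look like the sketch below. RemoveNodeSafely is a hypothetical name; it removes the node from whatever parent it has, rather than assuming the parent is the documentElement.

```vb
' Hypothetical helper: remove a node from its parent if both exist.
' Returns True when a node was removed, False otherwise.
Public Function RemoveNodeSafely(objNode As IXMLDOMNode) As Boolean
    ' Nothing to do if the lookup that produced objNode failed
    If objNode Is Nothing Then Exit Function

    ' A node that is not attached to a tree has no parent to remove it from
    If objNode.parentNode Is Nothing Then Exit Function

    objNode.parentNode.removeChild objNode
    RemoveNodeSafely = True
End Function
```

In deleteSelectedPerson, the save and TreeView update could then be made conditional on the function returning True.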
Conclusion
This example covered the following techniques for working with the DOMDocument:
- The fundamentals of loading an XML document into your program
- What to watch out for when you're using a DTD or Schema to validate your document
- Populating a TreeView control with the same hierarchy found in the DOMDocument
- When clicking on the TreeView control, how to find an item in the DOMDocument, based on a value stored in the tag of the selected TreeView item
- How to add a new item to an XML file, using the createElement and appendChild methods
- How to delete an item from the XML file, using the removeChild method
- How to save the document using the save method after making changes in the DOMDocument
Here are a few suggestions for expanding the application, and hints on doing a bit of experimentation to broaden your knowledge of working with the DOM:
- Update the document when the user has changed an item.
- In the populateTreeWithChildren procedure, we have specifically shown you how to use the IXMLDOMNamedNodeMap interface for reading attributes. You can change this procedure to use the getAttributeNode or getAttribute methods.
- In the section, "Adding a New Person to the DOMDocument," instead of using setAttribute for adding an attribute to the element, have a look at using the createAttribute method on the DOMDocument object and then using the setNamedItem method to add it to the element's attributes property.
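As a starting point for the two attribute suggestions, here is a minimal sketch. The procedure names ReadPersonID and AddPersonID are illustrative and do not appear in the downloadable project; getAttribute, createAttribute, and setNamedItem are the MSXML methods named in the list.

```vb
' Sketch: read PERSONID with getAttribute instead of the named node map.
Public Function ReadPersonID(objPerson As IXMLDOMElement) As String
    Dim varID As Variant
    varID = objPerson.getAttribute("PERSONID")
    ' getAttribute returns Null when the attribute is absent
    If IsNull(varID) Then
        ReadPersonID = ""
    Else
        ReadPersonID = varID
    End If
End Function

' Sketch: add PERSONID with createAttribute and setNamedItem
' instead of setAttribute.
Public Sub AddPersonID(objDoc As DOMDocument, _
                       objPerson As IXMLDOMElement, _
                       ByVal strID As String)
    Dim objAttr As IXMLDOMAttribute
    Set objAttr = objDoc.createAttribute("PERSONID")
    objAttr.Value = strID
    ' Attach the attribute node to the element's attributes collection
    objPerson.Attributes.setNamedItem objAttr
End Sub
```

The getAttribute form avoids the explicit length check on the node map, while the createAttribute form is more verbose than setAttribute but makes the attribute node itself available for further work.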
For more information on using XML and Visual Basic visit VBXML.com, run by the authors of this article.
Copyright 1999 Manning Publications Company. Reprinted with permission.