Learn How to Use PowerShell to Parse, Read, and Validate XML

The article discusses how PowerShell can handle XML data by using built-in cmdlets and .NET classes. It covers reading, parsing, and validating XML documents and best practices for error handling and performance.

By the end of the article, readers will better understand working with PowerShell XML and creating scripts to handle complex XML structures.

Requirements to Parse a PowerShell XML file

You need only two tools on your Windows PC to parse, read, and validate XML:

  • Windows PowerShell: version 3.0 or later works for most of the examples. We use version 5.0 for reference, which the using namespace statements in the schema section require.
  • A text editor that can handle XML: we recommend Visual Studio Code or Notepad++.
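
If you are not sure which version is installed, a quick check along these lines can help. This is a minimal sketch; the messages are illustrative and the thresholds simply mirror the requirements listed above.

```powershell
# $PSVersionTable is an automatic variable that PowerShell defines in every session.
$version = $PSVersionTable.PSVersion

if ($version.Major -ge 5) {
    "PowerShell $version meets the version used for reference in this article."
}
elseif ($version.Major -ge 3) {
    "PowerShell $version runs most examples, but not the 'using namespace' statements."
}
else {
    "PowerShell $version is older than this article assumes."
}
```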

XML Elements and Select-Xml

Select-Xml is a PowerShell cmdlet that searches an XML document with an XPath query. It returns each match as a SelectXmlInfo object whose Node property holds the matching XML node. To use Select-Xml, you specify the XPath query for the XML elements you want to find. For example, if you have an XML file with a <book> element that contains a <title> element, you can use the following code to extract the title:

$xml = [xml](Get-Content "C:\books.xml")
Select-Xml -Xml $xml -XPath "//book/title" | ForEach-Object { $_.Node.InnerText }

In this example, we first use the [xml] type accelerator to convert the XML file into an XML object, and then we use Select-Xml to search for all <title> elements within <book> elements. 

You can also use Select-Xml to search for elements based on their attributes. For example, if you have an XML file with a <book> element that has a "genre" attribute, you can use the following code to extract all books with a genre of "fiction":

Select-Xml -Xml $xml -XPath "//book[@genre='fiction']" | ForEach-Object { $_.Node.InnerText }

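In this example, we use the [@attribute='value'] syntax to search for all <book> elements with a "genre" attribute equal to "fiction". Because InnerText is read from the <book> node itself, the output is the concatenated text of all its child elements.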

Select-Xml is a powerful tool for parsing and extracting data from XML documents in PowerShell. By using XPath queries, you can quickly and easily locate the XML elements you need and extract their values for further processing.
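
As a self-contained sketch of the two queries above, the snippet below builds a small document in memory instead of reading C:\books.xml. The sample titles and genre values are made up for illustration, since the article does not show the file's content.

```powershell
# Hypothetical sample data standing in for C:\books.xml.
$xml = [xml]@"
<books>
  <book genre="fiction"><title>The Great Gatsby</title></book>
  <book genre="reference"><title>PowerShell in Action</title></book>
</books>
"@

# All <title> elements that sit directly under a <book> element.
$titles = Select-Xml -Xml $xml -XPath "//book/title" |
    ForEach-Object { $_.Node.InnerText }

# Only the titles of books whose genre attribute equals "fiction".
$fiction = Select-Xml -Xml $xml -XPath "//book[@genre='fiction']/title" |
    ForEach-Object { $_.Node.InnerText }

$titles
$fiction
```

Appending /title to the filtered query returns just the title text rather than the combined text of the whole <book> element.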

Parsing XML Attributes through Select-Xml

In addition to parsing XML elements, Select-Xml can also be used to extract data from XML attributes. To extract XML attributes using Select-Xml, you need to specify the attribute name in the XPath query.

For example, if you have an XML file with a <book> element that has a "price" attribute, you can use the following code to extract the price:

$xml = [xml](Get-Content "C:\books.xml")
Select-Xml -Xml $xml -XPath "//book/@price" | ForEach-Object { $_.Node.Value }

In this example, we use the "@" symbol followed by the attribute name to specify the "price" attribute. The results are piped to the ForEach-Object cmdlet, which extracts the value of the attribute.

You can also use Select-Xml to search for elements based on their attributes and then extract the values of other attributes. For example, if you have an XML file with a <book> element that has "author" and "title" attributes, you can use the following code to extract the title of all books written by a specific author:

Select-Xml -Xml $xml -XPath "//book[@author='J.K. Rowling']/@title" | ForEach-Object { $_.Node.Value }

In this example, we use the [@attribute='value'] syntax to search for all <book> elements with an "author" attribute equal to "J.K. Rowling", and then extract the "title" attribute values of these elements.

Select-Xml provides a powerful and flexible way to parse XML attributes in PowerShell. By specifying the attribute name in the XPath query, you can quickly and easily extract specific data from XML elements.
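
The attribute queries above can be tried without a file. The following is a minimal sketch that assumes a document where author, title, and price are stored as attributes; the price values are invented for the example.

```powershell
# Hypothetical sample data with author, title, and price stored as attributes.
$xml = [xml]@"
<books>
  <book author="J.K. Rowling" title="Harry Potter and the Philosopher's Stone" price="9.99" />
  <book author="F. Scott Fitzgerald" title="The Great Gatsby" price="7.50" />
</books>
"@

# Every price attribute, converted from text to a number for later arithmetic.
$prices = Select-Xml -Xml $xml -XPath "//book/@price" |
    ForEach-Object { [decimal]$_.Node.Value }

# Titles of the books written by one author.
$rowling = Select-Xml -Xml $xml -XPath "//book[@author='J.K. Rowling']/@title" |
    ForEach-Object { $_.Node.Value }

$prices
$rowling
```

Attribute values always arrive as strings, so the cast to [decimal] is only needed when you want to calculate with them.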

Converting XML Strings to Objects

In PowerShell, you can convert an XML string into an object with the [xml] type accelerator. The result is a System.Xml.XmlDocument, so you can work with the data through its properties and methods.

To convert an XML string, cast it to [xml]. For example, suppose you have an XML string representing a book:

$xmlString = @"
<book>
  <title>PowerShell in Action</title>
  <author>Bruce Payette</author>
  <publisher>Manning Publications</publisher>
  <year>2011</year>
</book>
"@

You can convert this XML string into an object using the following code:

$xmlObject = [xml]$xmlString

Now you can access the properties and methods of the $xmlObject as you would with any other PowerShell object. For example, you can access the book title using the following code:

$xmlObject.book.title

You can also use Select-Xml to query the XML object and extract specific elements or attributes. For example, to extract the book title using Select-Xml, you can use the following code:

Select-Xml -Xml $xmlObject -XPath "//book/title" | ForEach-Object { $_.Node.InnerText }

In this example, we use the -Xml parameter to specify the XML object, and then use the XPath query "//book/title" to extract the title element. The results are piped to the ForEach-Object cmdlet to extract the inner text of the title element.
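
The [xml] cast throws an error when the string is not well-formed XML, so it can help to wrap it in try/catch. Here is a minimal sketch with a deliberately broken sample string; the variable names are only illustrative.

```powershell
# Deliberately malformed: the <title> element is never closed.
$badXml = "<book><title>PowerShell in Action</book>"

try {
    $doc = [xml]$badXml
    $parsed = $true
}
catch {
    # The cast failed, so report the parser's message instead of stopping the script.
    $parsed = $false
    Write-Warning "Could not parse XML: $($_.Exception.Message)"
}

$parsed
```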

How to read XML Object Elements?

Reading XML object elements in PowerShell can be done in various ways depending on the complexity of the XML structure. One way to read the XML elements is by using the dot notation to access the child elements.

For example, suppose you have an XML file with the following structure:

<books> <book> <title>Harry Potter and the Philosopher's Stone</title> <author>J.K. Rowling</author> <year>1997</year> </book> <book> <title>The Great Gatsby</title> <author>F. Scott Fitzgerald</author> <year>1925</year> </book> </books>

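You can load this XML file into PowerShell by casting the output of Get-Content to [xml], and then use the Select-Xml cmdlet to select the elements:

# Cast the file content to [xml] so that Select-Xml receives an XML document
$xml = [xml](Get-Content "C:\books.xml")
$books = Select-Xml -Xml $xml -XPath "//book"
foreach ($book in $books) {
    # Node is the matched <book> element; its children are available via dot notation
    Write-Output "Title: $($book.Node.title)"
    Write-Output "Author: $($book.Node.author)"
    Write-Output "Year: $($book.Node.year)"
    Write-Output ""
}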

In the above code, the Select-Xml cmdlet is used to select all the book elements in the XML file, and the foreach loop is used to iterate over each book element. The dot notation is used to access the child elements of the book element and display their values using the Write-Output cmdlet.

This is just one way to read XML object elements in PowerShell, and there are many other techniques and cmdlets available depending on your specific needs and requirements.
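
One of those other techniques is to skip Select-Xml and walk the document with dot notation alone. The sketch below uses the same two books as above, loaded from an inline string, and turns each <book> into a plain object; the property names Title, Author, and Year are our own choice.

```powershell
$xml = [xml]@"
<books>
  <book><title>Harry Potter and the Philosopher's Stone</title><author>J.K. Rowling</author><year>1997</year></book>
  <book><title>The Great Gatsby</title><author>F. Scott Fitzgerald</author><year>1925</year></book>
</books>
"@

# $xml.books.book returns every <book> child, so it can be piped like any collection.
$summary = $xml.books.book | ForEach-Object {
    [pscustomobject]@{
        Title  = $_.title
        Author = $_.author
        Year   = [int]$_.year   # element text is a string until converted
    }
}

# Plain objects can then be filtered with the usual cmdlets.
$older = $summary | Where-Object Year -lt 1950 | Select-Object -ExpandProperty Title
$older
```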

How to read XML Attributes?

To read XML attributes in PowerShell, you can use the Select-Xml cmdlet in combination with XPath expressions. XPath is a language used to select specific parts of an XML document.

To read XML attributes, you can use the "@" symbol followed by the attribute name in the XPath expression. For example, if you have an XML element with an attribute called "id", you can select it using the following XPath expression:

Select-Xml -Path "example.xml" -XPath "//element/@id"

This will return a collection of objects representing the "id" attributes of all "element" elements in the XML file.

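You can also use the Value property of each returned node to access the value of the attribute, like so:

$ids = Select-Xml -Path "example.xml" -XPath "//element/@id"
$ids.Node.Value

This returns the value of the "id" attribute for every matching "element" element, because PowerShell 3.0 and later apply the property to each item in the collection. To get only the first value, pipe the output to Select-Object -First 1.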

In addition to Select-Xml, you can also use other cmdlets and .NET classes like Get-Content, XmlReader, and XmlDocument to read XML attributes in PowerShell.
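
For instance, once a document is loaded as an XmlDocument, attributes can be read without any XPath. This is a small sketch with an inline sample; the "root" wrapper element is an assumption, while "element" and "id" mirror the names used above.

```powershell
# Hypothetical sample document.
$doc = [xml]'<root><element id="a1" /><element id="b2" /></root>'

# Dot notation exposes attributes as properties of the element.
$ids = $doc.root.element | ForEach-Object { $_.id }

# GetAttribute returns an empty string when the attribute does not exist.
$first   = $doc.root.element[0].GetAttribute("id")
$missing = $doc.root.element[0].GetAttribute("name")

$ids
```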

Iterating through XML Data

To iterate XML data in PowerShell, you can use a combination of cmdlets and .NET classes like Get-Content, Select-Xml, XmlReader, and XmlDocument.

Here's an example of how to iterate through all the elements in an XML file using the Select-Xml cmdlet:

$xml = Select-Xml -Path "example.xml" -XPath "//element"
foreach ($element in $xml) {
    # Do something with each element
    $element.Node.Name
}

This will iterate through all the "element" elements in the XML file and perform some action on each element. In this case, it simply prints the name of each element.

Alternatively, you can use the XmlReader class to iterate through XML data using a while loop:

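# Resolve the path first: .NET methods do not use PowerShell's current location
$reader = [System.Xml.XmlReader]::Create((Resolve-Path "example.xml").Path)
while ($reader.Read()) {
    if ($reader.NodeType -eq [System.Xml.XmlNodeType]::Element) {
        # Do something with each element
        $reader.Name
    }
}
# Release the file handle when finished
$reader.Close()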

This code creates an XmlReader object to read the XML data from the file "example.xml". It then uses a while loop to iterate through the XML data one node at a time. When it encounters an element node, it performs some action on it. In this case, it simply prints the name of each element.
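
The same loop can also pick up attribute values as it goes. The sketch below reads from an in-memory string instead of example.xml so that it runs on its own; the sample content is hypothetical.

```powershell
$xmlText = '<root><element id="a1">one</element><element id="b2">two</element></root>'

$reader = [System.Xml.XmlReader]::Create([System.IO.StringReader]::new($xmlText))
$found = @()
try {
    while ($reader.Read()) {
        # Only start tags named "element" are of interest here.
        if ($reader.NodeType -eq [System.Xml.XmlNodeType]::Element -and $reader.Name -eq 'element') {
            $found += $reader.GetAttribute('id')
        }
    }
}
finally {
    # Release the reader even if Read() throws on malformed input.
    $reader.Dispose()
}

$found
```

Because XmlReader moves forward through the input one node at a time, this approach suits large files where loading the whole document with [xml] would use too much memory.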

XML Schemas using PowerShell

In PowerShell, you can create and validate XML documents against XML schemas using the built-in .NET classes. An XML schema is a formal description of the structure and content of an XML document. It defines the elements, attributes, and data types that are allowed in the document.

To create an XML schema in PowerShell, you can use the XmlSchema class from the System.Xml.Schema namespace. The XmlSchema class provides properties and methods to define elements, attributes, data types, and other schema components.

Here's an example of how to create a simple XML schema in PowerShell:

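# Import the System.Xml.Schema namespace (PowerShell 5.0+; must be the first statement in the script)
using namespace System.Xml.Schema

# Create a new schema object and the qualified name of the built-in xs:string type
$schema = [XmlSchema]::new()
$xsString = [System.Xml.XmlQualifiedName]::new("string", "http://www.w3.org/2001/XMLSchema")

# Define an attribute
$attribute = [XmlSchemaAttribute]::new()
$attribute.Name = "isbn"
$attribute.SchemaTypeName = $xsString

# An element can only carry attributes through a complex type: here, xs:string extended with isbn
$extension = [XmlSchemaSimpleContentExtension]::new()
$extension.BaseTypeName = $xsString
[void]$extension.Attributes.Add($attribute)
$type = [XmlSchemaComplexType]::new()
$type.ContentModel = [XmlSchemaSimpleContent]::new()
$type.ContentModel.Content = $extension

# Define an element
$element = [XmlSchemaElement]::new()
$element.Name = "book"
$element.SchemaType = $type
[void]$schema.Items.Add($element)

# Save the schema to a file (XmlSchema.Write needs a writer or stream, not a file name)
$writer = [System.Xml.XmlWriter]::Create((Join-Path $PWD "schema.xsd"))
$schema.Write($writer)
$writer.Close()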

This code creates a schema with a single element called "book" and an attribute called "isbn". The schema is saved to a file called "schema.xsd".
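
Before relying on a schema, it can be worth compiling it to confirm that it is internally consistent. The sketch below does this for a hand-written XSD that we assume is equivalent to the "book" and "isbn" schema described above; it is loaded from a string so that no file is needed.

```powershell
# Hand-written XSD: a <book> element with text content and an isbn attribute (assumed equivalent).
$xsdText = @"
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="isbn" type="xs:string" />
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
"@

$schema = [System.Xml.Schema.XmlSchema]::Read([System.IO.StringReader]::new($xsdText), $null)

# Compile() throws an XmlSchemaException if the schema is inconsistent.
$set = [System.Xml.Schema.XmlSchemaSet]::new()
[void]$set.Add($schema)
$set.Compile()

# List the names of the top-level elements the compiled schema declares.
$elementNames = $set.GlobalElements.Names | ForEach-Object { $_.Name }
$elementNames
```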

To validate an XML document against a schema in PowerShell, you can use the XmlReaderSettings class from the System.Xml namespace. The XmlReaderSettings class provides properties to set the validation type, schema location, and other validation options.

Here's an example of how to validate an XML document against the schema created above:

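# Import the System.Xml namespace (a using statement must appear at the top of the script)
using namespace System.Xml

# Create a new XmlReaderSettings object
$settings = [XmlReaderSettings]::new()
$settings.ValidationType = [ValidationType]::Schema
[void]$settings.Schemas.Add($null, (Resolve-Path "schema.xsd").Path)

# Create a new XmlReader object with the settings
$reader = [XmlReader]::Create((Resolve-Path "books.xml").Path, $settings)

# Validate the XML document; with no event handler attached, the first violation throws
try {
    while ($reader.Read()) {}
    Write-Output "books.xml is valid."
}
catch { Write-Warning "Validation failed: $($_.Exception.Message)" }
finally { $reader.Close() }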

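This code validates an XML document called "books.xml" against the schema created above. The XmlReaderSettings object is configured to use schema validation and to load the schema file "schema.xsd". The XmlReader object is created with those settings, and the document is validated as the while loop reads through it. Because no validation event handler is attached, the first schema violation raises an exception instead of being collected. Keep in mind that the document has to match the schema: the schema above describes a single "book" element, so it will not accept the multi-book <books> document shown earlier.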
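
If you would rather collect every problem than stop at the first one, you can attach a validation event handler to the settings. The following is a minimal sketch under stated assumptions: the inline schema, the sample document, and the $problems list are all hypothetical and exist only to make the example runnable without files.

```powershell
# Hypothetical inline schema: a <book> with a string title and an integer year.
$xsdText = @"
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string" />
        <xs:element name="year" type="xs:int" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"@

# Hypothetical document whose <year> is deliberately not an integer.
$xmlText = '<book><title>The Great Gatsby</title><year>not-a-number</year></book>'

$settings = [System.Xml.XmlReaderSettings]::new()
$settings.ValidationType = [System.Xml.ValidationType]::Schema
$schema = [System.Xml.Schema.XmlSchema]::Read([System.IO.StringReader]::new($xsdText), $null)
[void]$settings.Schemas.Add($schema)

# Record each validation event instead of letting the reader throw.
$problems = [System.Collections.Generic.List[string]]::new()
$settings.add_ValidationEventHandler({
    param($source, $e)
    $problems.Add("$($e.Severity): $($e.Message)")
})

$reader = [System.Xml.XmlReader]::Create([System.IO.StringReader]::new($xmlText), $settings)
try { while ($reader.Read()) { } }
finally { $reader.Dispose() }

$problems
```

An empty $problems list after the loop means the document passed; otherwise each entry describes one violation.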

Overall, using PowerShell to create and validate XML schemas can be a powerful tool for ensuring the correctness and consistency of your XML data.

By now, you should be able to use PowerShell to parse, read, and validate XML files on your PC. For more detail, see the official Microsoft PowerShell documentation.

Meet the Author

Abdul Rahim has been working in Information Technology for over two decades.