
Product Review: XML in the vortex: A review of VorteXML Designer V3

VorteXML Designer V3
Datawatch Corp.
Lowell, Mass.
978-441-2200
www.datawatch.com
Rating: 3 out of 5

Mission-critical business applications now rely so heavily on XML that non-XML documents often have to be reformatted before they can be integrated into more modern systems. Datawatch's VorteXML Designer is a platform for visually transforming text documents into XML, as well as mapping that XML to a Schema.

After spending some time with Version 3.0 of VorteXML Designer, my impression is that Datawatch followed the 'Do one thing, and do it well' design philosophy. And it does allow developers or savvy non-developers to transform text into XML. But the vendor left out features that could have taken the product to the next level.

VorteXML Designer breaks down the task of transforming a text file into valid XML into a three-step process: defining which portions of the file's data you want to capture, refining the hierarchical organization of the captured data, and mapping your XML output to a Schema document.

The product has two modes of operation. Input mode covers defining and refining the data to be exported as XML. In Output mode, the XML output is mapped to a Schema document. The Tree Pane provides a hierarchical view of the data you are working with. In Input mode, it lists the organization of the data sets you have highlighted for export. In Output mode, it shows the structure of the schema.

Identifying which parts of the document to transfer to XML is done using Traps. A Trap is a pattern within the document that the parser can repeatedly identify, such as the type of data, the position or layout of the text in the document, or a recognizable pattern within the text. Each Trap tool performed flawlessly during testing. Once you have identified information to be captured by a Trap, the product propagates the pattern throughout the document, highlighting every portion of the file that will be captured by that same Trap. I was impressed with its accuracy at pinpointing identical data sets throughout the document.
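For readers who think in code, the Trap idea can be approximated in a few lines of Python. This is only a sketch of the general technique, not a description of how VorteXML works internally: the sample report, the regular expression and the element names (`items`, `item`, `sku`, `qty`, `price`) are all assumptions made up for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical report text; the layout is an assumption for illustration.
REPORT = """\
ORDER 1001  2004-03-15
  WIDGET-A     3    19.99
  WIDGET-B     1     5.00
ORDER 1002  2004-03-16
  GADGET-X     2    42.50
"""

# A stand-in for a "trap": a pattern that recognizes one kind of line
# (here, indented detail lines with a SKU, a quantity and a price).
ITEM_TRAP = re.compile(
    r"^\s+(?P<sku>[A-Z]+-[A-Z])\s+(?P<qty>\d+)\s+(?P<price>\d+\.\d{2})\s*$"
)

def trap_items(text):
    """Apply the pattern to every line and collect the matches as XML."""
    root = ET.Element("items")
    for line in text.splitlines():
        match = ITEM_TRAP.match(line)
        if match is None:
            continue  # line is not captured by this trap
        item = ET.SubElement(root, "item")
        for field, value in match.groupdict().items():
            ET.SubElement(item, field).text = value
    return root

items_xml = ET.tostring(trap_items(REPORT), encoding="unicode")
```

The point of the sketch is the propagation step: the pattern is defined once and then applied to the whole file, which is what the product does visually when it highlights every matching region.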

A verification tool ensures that your Trap will consistently capture data throughout your file. A dialog box walks you through each problem area and lets you modify your Trap accordingly. This is especially useful for long documents. Data sets identified through the Trap process are called Templates. Each of the defined Templates appears in the Tree Pane. A group of Templates for a document is saved as a Profile.

VorteXML Designer provides two distinct views for previewing data selected in the Trap process: Report and Table. Report, the default view in Input mode, is a rendering of the file you are working with, with the information identified for capture highlighted. Table view is a spreadsheet-style layout of the captured data from the file.

You can dynamically alter, transform or calculate data by applying Filters within your Templates. Filters are accessible from within Table view, and you can apply any number of them to the data.

After locating the relevant information in your document, you must order the relationships of each Template so that the XML document has a logical structure. Hierarchical relationships can be modified visually by promoting or demoting Templates within the Tree Pane.
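A rough code analogy for this step: treat one pattern as the parent and another as its child, and nest each child match under the most recent parent match. The sketch below assumes a hypothetical order report with header lines and indented item lines; the patterns and element names are illustrative and have nothing to do with the product's own file formats.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical input; header lines start at column 0, items are indented.
SAMPLE = """\
ORDER 1001  2004-03-15
  WIDGET-A     3    19.99
  WIDGET-B     1     5.00
ORDER 1002  2004-03-16
  GADGET-X     2    42.50
"""

# Parent pattern (order header) and child pattern (item line).
ORDER_TRAP = re.compile(r"^ORDER (?P<id>\d+)\s+(?P<date>\d{4}-\d{2}-\d{2})$")
ITEM_TRAP = re.compile(r"^\s+(?P<sku>\S+)\s+(?P<qty>\d+)\s+(?P<price>\d+\.\d{2})$")

def build_tree(text):
    """Nest every item under the order header that precedes it."""
    root = ET.Element("orders")
    current_order = None
    for line in text.splitlines():
        header = ORDER_TRAP.match(line)
        if header:
            current_order = ET.SubElement(
                root, "order", id=header["id"], date=header["date"]
            )
            continue
        detail = ITEM_TRAP.match(line)
        if detail and current_order is not None:
            # Child fields are stored as attributes to keep the sketch short.
            ET.SubElement(current_order, "item", detail.groupdict())
    return root

tree = build_tree(SAMPLE)
```

Promoting or demoting a Template in the Tree Pane corresponds, in this analogy, to changing which pattern plays the parent role.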

You can also specify the number of occurrences of trapped data in the resulting XML output. VorteXML estimates occurrence counts during the trapping process, but you can override them at any time.

Then you can begin to apply a schema to the resulting XML output in Designer's Output mode. The product supports three XML schema types: XDR, DTD and XSD. Schema files can be accessed locally or remotely via HTTP. If you aren't using an industry-standard schema or your own schema, VorteXML can automatically generate a valid XSD schema from your defined Templates. This feature is critical; without automatic schema generation, a non-developer would be dependent on an outside source or another software tool for the final piece of the XML puzzle.

After loading a schema, VorteXML opens the Schema Log window, which identifies elements within your data that are required to be mapped, according to the schema definition.

In Output mode, the schema's hierarchical structure replaces the Template list in the Tree Pane. Although the Tree Pane provides an outline of your schema, you can't view or edit the schema document within VorteXML. That requires an external editor, like Altova's XMLSpy.

To map the document to the schema, you must walk through the schema structure item-by-item and identify which Template and field to map to the selected element in the schema. The process is rote but straightforward. The resulting XML output can be previewed at any point in the Output process.

In the mapping dialog, there is an option to map the selected template field to a schema's text element or attribute. VorteXML Designer did an excellent job of automatically determining the correct base data type for each Template's field.
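The element-versus-attribute choice is easy to picture in code. The following is a minimal sketch under my own assumptions: the mapping table, the field names and the target names (`orderId`, `OrderDate`) are hypothetical, and the product itself exposes this choice through a dialog rather than a data structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping table: captured field -> (kind, name in the schema).
MAPPING = {
    "id": ("attribute", "orderId"),
    "date": ("element", "OrderDate"),
}

def apply_mapping(record, mapping, element_name):
    """Write each captured field as either an attribute or a child element."""
    node = ET.Element(element_name)
    for field, (kind, target) in mapping.items():
        value = record[field]
        if kind == "attribute":
            node.set(target, value)
        else:
            ET.SubElement(node, target).text = value
    return node

order = apply_mapping({"id": "1001", "date": "2004-03-15"}, MAPPING, "Order")
```

Here the `id` field ends up as an attribute on `Order`, while `date` becomes a child element with text content, which are the two options the mapping dialog offers.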

The product provides a Repetition tool that allows you to define the number of times your mapped Template should occur within an XML document; it also lets you create a separate XML document for each occurrence of a field.
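Producing one XML document per occurrence amounts to splitting a combined tree on a repeating element. A minimal sketch, assuming a hypothetical `orders` document and a helper name of my own choosing:

```python
import xml.etree.ElementTree as ET

def split_per_occurrence(root, tag):
    """Return one serialized XML document for each direct child named `tag`."""
    documents = []
    for element in root.findall(tag):
        documents.append(ET.tostring(element, encoding="unicode"))
    return documents

# Hypothetical combined output with two occurrences of <order>.
combined = ET.fromstring("<orders><order id='1001'/><order id='1002'/></orders>")
docs = split_per_occurrence(combined, "order")
```

Each string in `docs` is a standalone document whose root is a single `order` element.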

After mapping, you can view the resulting XML output under the Browse tab in Output mode. If you are generating multiple XML documents, you can scroll through them using the pagination buttons in the Output mode toolbar. VorteXML also gives you the option of validating the XML output from your mapped document, as well as applying an XSL or CSS stylesheet.

I do have some criticisms of VorteXML Designer. It can only work natively with ASCII files. (Datawatch lists the compatible file types as text, log files, HTML and ASP; I was also able to work with PHP, JSP, e-mails and other text-based files without problems.) It is simple to re-save a Microsoft Office or OpenOffice document in a text format, but what if you had hundreds of such documents? At that scale, the ability to parse formats such as MS Word and Adobe PDF natively becomes a necessity. The APIs for these file types are available, so I would hope to see native support for them in future versions.

Also lacking was a feature that would let me trigger VorteXML Designer to apply defined Profiles automatically to every file in a selected directory. The current version requires you to open every file individually and manually export XML for each one.

Last but not least is the lack of platform availability. VorteXML Designer is a Windows-only app. Versions for one or more of the popular desktop Linux distros, Apple's Mac OS X and Sun's Solaris would be welcome.

Bottom Line: VorteXML Designer V3 is a capable tool with solid functionality, but it is missing the polish and some features that would make it truly excellent.

More information is available at the Datawatch Web site: http://www.datawatch.com/

Pricing and Availability: Datawatch's VorteXML Designer V3 is a platform for visually transforming text documents into XML, as well as mapping that XML to a Schema.

Price: $599

Pros:

  • Visual interface allows VorteXML Designer to be used by developers and non-developers/business analysts. 
  • Supports major XML schema types; XSD support in V3. 
  • Can automatically generate valid XSD schema. 
  • Tight integration between the VorteXML Designer client application and the VorteXML Server product.

Cons:

  • Some commonly used document types, such as Microsoft Word and Adobe PDF, are not supported. 
  • Lacks the ability to view/edit loaded or generated schema documents. 
  • No method of automating XML generation for large numbers of files. 
  • Only available for Windows.

About the Author

Jason Halla is an enterprise J2EE architect with a Fortune 500 company in Indianapolis, and moderator of Devshed's popular Java, PHP and XML forums. He can be reached at [email protected].