Monday, January 16, 2006

MSXML XSLT processor in JScript

Here is an XSLT transformation engine for Windows in just 4k of JScript code, built on the MSXML SDK. Copy the snippet below, paste it into a file named mbxsl.wsf, then launch it to see the usage help.

MSXML is probably both the fastest XSLT engine around at the moment and the worst one to work with. Do not ask why - it's empirical. You can get a taste of it by installing the Windows Script Host 5.6 SDK and the MSXML 4.0 SDK and trying to write a simple XSLT transformation engine that outputs text in a specific encoding such as windows-1251. There are several ways to build the engine and only one that works as expected. Ok, let's get through this quickly:

.transformNode() always returns a UTF-16 string (there is no way to convert/iconv it while writing).
.transformNodeToObject() requires the output to be well-formed XML if the target is a DOMDocument. I didn't find anything else in these SDKs to use as the target, so no luck with plain text output.

You can use the IStream interface described below, but you will find neither a reference for IStream nor one for its ADODB.Stream implementation in these SDKs. So I knew nothing about ADODB.Stream and assumed the SDKs must describe some other way to do the task. I found the IXSLProcessor interface, which, unfortunately, doesn't let me keep the "encoding" attribute in the XML declaration along with the (correctly, btw) encoded data when writing through an opened TextStream. I didn't need that declaration for plain text output at first, but later it turned into a problem. How much have you understood so far? Consider how much garbage filtered through my head before I came up with the solution... Ok, after some Google cache data mining on IXSLProcessor+IStream I found Rob Shields' page, which fortunately contained an example of using an IStream implementation. I'll stop here and post the solution: "how to implement a binary XSLT transformation engine in JScript with MSXML".
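The stream-based route can be boiled down to a few lines. This is only a sketch of the idea, not the full script: it assumes the two documents are already loaded, and the `create` parameter is a hypothetical injection point (not part of MSXML) that lets the logic be exercised without COM. Under Windows Script Host it falls back to ActiveXObject.

```javascript
// Default COM factory; only usable under Windows Script Host.
function newCom(progId) { return new ActiveXObject(progId); }

// Run the stylesheet and write the processor's raw bytes to outPath.
// xmlDoc: loaded DOMDocument, xslDoc: loaded FreeThreadedDOMDocument.
// create: hypothetical factory parameter, defaults to newCom.
function transformToFile(xmlDoc, xslDoc, outPath, create) {
  create = create || newCom;
  var tpl = create("MSXML2.XSLTemplate.4.0");
  tpl.stylesheet = xslDoc;
  var proc = tpl.createProcessor();
  proc.input = xmlDoc;

  var stream = create("ADODB.Stream");
  stream.Mode = 3;            // adModeReadWrite
  stream.Type = 1;            // adTypeBinary: bytes pass through untouched
  stream.Open();
  proc.output = stream;       // processor writes into the stream

  var ok = proc.transform();
  if (ok) stream.SaveToFile(outPath, 2);  // adSaveCreateOverWrite
  stream.Close();
  return ok;
}
```

Because the stream is binary, the bytes land in the file exactly as the processor encoded them, so nothing is re-encoded to UTF-16 on the way out.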

If you need a correct XSLT transformation in MSXML that honors the encoding specified in xsl:output, try this:
cscript mbxsl.wsf /xml:in.xml /xsl:filter.xsl /out:result.out

---cut----[mbxsl.wsf]

<?xml version="1.0" encoding="windows-1251"?>
<package xmlns="uri:wsf">
<job id="mbxsl">
<?job error="true" debug="false"?>
<runtime>
<description>
XSLT engine implementation in MS XML using Windows Scripting Host
by Max Belugin and anatoly techtonik
http://farplugins.svn.sourceforge.net/viewvc/farplugins/trunk/plugbase/
</description>
<named
name ="xml"
helpstring ="source xml"
type ="string"
required ="true"
/>
<named
name ="xsl"
helpstring ="template xsl"
type ="string"
required ="true"
/>
<named
name ="out"
helpstring ="output file/folder"
type ="string"
required ="false"
/>

<named
name ="split"
helpstring ="if true (+) then split output to files and place in "
type ="boolean"
required ="false"
/>
<example>
cscript //nologo mbxsl.wsf /xml:navy.hrd /xsl:hrd2css.xsl
</example>
</runtime>

<script language="JScript">
<![CDATA[
var xmlDoc // as DOMDocument
var xslDoc // as FreeThreadedDOMDocument (required by XSLTemplate)
var xslObj // as XSLTemplate
var xslProc // as XSLProcessor
var targetDoc // as DOMDocument
var args // as WshNamed
=WScript.Arguments.Named;
if(args.Exists("xml")&&args.Exists("xsl")){
// WScript.Echo(args("xml"));
// WScript.Echo(args("xsl"));
xmlDoc=new ActiveXObject("MSXML2.DOMDocument.4.0")
xslDoc=new ActiveXObject("MSXML2.FreeThreadedDOMDocument.4.0")
xmlDoc.async=false;
xslDoc.async=false;
xmlDoc.validateOnParse=false;
xslDoc.validateOnParse=false;
xmlDoc.load(args("xml"));
xslDoc.load(args("xsl"));
if (xmlDoc.parseError.errorCode != 0)
WScript.echo("XML parse error: " + "line " + xmlDoc.parseError.line +
" pos " + xmlDoc.parseError.linepos + " code " +
xmlDoc.parseError.errorCode + "\n" + xmlDoc.parseError.reason);
if (xslDoc.parseError.errorCode != 0)
WScript.echo("XSL parse error: " + "line " + xslDoc.parseError.line +
" pos " + xslDoc.parseError.linepos + " code "+
xslDoc.parseError.errorCode + "\n" + xslDoc.parseError.reason);

if(args("split")){
targetDoc=new ActiveXObject("MSXML2.DOMDocument.4.0")
targetDoc.async=false;
//targetDoc.preserveWhiteSpace=true;
xmlDoc.transformNodeToObject(xslDoc, targetDoc);
if(args.Exists("out")){
var fs // as FileSystemObject
=new ActiveXObject("Scripting.FileSystemObject");
var out // as Folder
=fs.GetFolder(args("out"));
var files // as IXMLDOMNodeList
=targetDoc.selectNodes("/files/file");
var i=new Enumerator(files);
for (;!i.atEnd();i.moveNext()){
var file// as IXMLDOMNode
=i.item();
var fileName=file.getAttribute("name");
//var outPath=fs.BuildPath(out.Path, fileName);
var outFile=out.CreateTextFile(fileName);
outFile.write(file.firstChild.xml);
outFile.close();
};

}else
WScript.echo(targetDoc.xml);
}else{
xslObj=new ActiveXObject("MSXML2.XSLTemplate.4.0")
xslObj.stylesheet=xslDoc;
xslProc=xslObj.createProcessor();
xslProc.input = xmlDoc;
// IStream is necessary to avoid conversion to UTF-16 and
// losing the encoding="windows-1251" attribute from the xml declaration
if(!args.Exists("out")){
if (!xslProc.transform())
WScript.echo("transformation error");
WScript.echo(xslProc.output);
}else{
var oStream = new ActiveXObject ("ADODB.Stream");
oStream.Mode = 3; // adModeReadWrite
oStream.Type = 1; // adTypeBinary
oStream.Open();
xslProc.output = oStream;
if (!xslProc.transform())
WScript.echo("transformation error");
oStream.saveToFile(args("out"), 2); // adSaveCreateOverWrite
oStream.close();
}
};
} else {
WScript.Arguments.ShowUsage();
};
]]>
</script>
</job>
</package>

---cut----[mbxsl.wsf]
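The script reports XML and XSL load failures with two near-identical statements, which makes it easy to print one document's error under the other's label. A small helper keeps them in step. This is a sketch only: formatParseError is a hypothetical name, not part of the script above, and it relies solely on the parseError properties the script already reads.

```javascript
// Hypothetical helper: builds the message the script echoes when a
// document fails to load. err is an MSXML parseError-like object.
function formatParseError(label, err) {
  return label + " parse error: line " + err.line +
         " pos " + err.linepos + " code " + err.errorCode +
         "\n" + err.reason;
}

// Intended use under Windows Script Host:
// if (xslDoc.parseError.errorCode != 0)
//   WScript.Echo(formatParseError("XSL", xslDoc.parseError));
```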

Problem description.
[1] Encoding issues using the MS XSLT engine
[2] transformNodeToObject method error

Thanks to Rob Shields for the ADODB.Stream usage example.
