Libxslt is the XSLT C library developed for the Gnome project. XSLT itself is an XML language to define transformations for XML. Libxslt is based on libxml2, the XML C library developed for the Gnome project. It also implements most of the EXSLT set of processor-portable extension functions and some of Saxon's evaluate and expressions extensions.
People can either embed the library in their application or use xsltproc, the command line processing tool. This library is free software and can be reused in commercial applications (see the intro).
External documents:
Logo designed by Marc Liyanage.
This document describes libxslt, the XSLT C library developed for the Gnome project.
Here are some key points about libxslt:
There are some on-line resources about using libxslt:
If you need help with the XSLT language itself, here are a number of useful resources:
Well, bugs or missing features are always possible, and I will make a point of fixing them in a timely fashion. The best way to report a bug is to use the Gnome bug tracking database (make sure to use the "libxslt" module name). Before filing a bug, check the list of existing libxslt bugs to make sure it hasn't already been filed. I look at reports there regularly and it's good to have a reminder when a bug is still open. Be sure to specify that the bug is for the package libxslt.
For small problems you can try to get help on IRC: the #xml channel on irc.gnome.org (port 6667) usually has a few people subscribed who may help (but there is no guarantee, and if a real issue is raised it should go to the mailing-list for archival).
There is also a mailing-list xslt@gnome.org for libxslt, with an on-line archive. To subscribe to this list, please visit the associated Web page and follow the instructions.
Alternatively, you can just send the bug to the xslt@gnome.org list; if it's really libxslt related I will approve it. Please do not send me mail directly, especially for portability problems: it makes things really harder to track and in some cases I'm not the best person to answer a given question, so ask the list instead. Do not send code, I won't debug it (but patches are really appreciated!).
Please note that with the current amount of virus and SPAM, sending mail to the list without being subscribed won't work. There are *far too many bounces* (on the order of a thousand a day!) and I cannot approve them manually anymore. If your mail to the list bounced waiting for administrator approval, it is LOST! Repost it and fix the problem triggering the error. Also please note that emails with a legal warning asking not to copy or redistribute freely the information they contain are NOT acceptable for the mailing-list; such mail will as much as possible be discarded automatically, and is less likely to be answered if it makes it to the list. DO NOT post to the list from an email address where such legal requirements are automatically added; get private paying support if you can't share information.
Check the following too before posting:
Then send the bug with the associated information to reproduce it to the xslt@gnome.org list; if it's really libxslt related I will approve it. Please do not send mail to me directly: it makes things really hard to track and in some cases I am not the best person to answer a given question, so ask on the list.
To be really clear about support:
Of course, bug reports with a suggested patch for fixing them will probably be processed faster.
If you're looking for help, a quick look at the list archive may actually provide the answer; I usually send source samples when answering libxslt usage questions. The auto-generated documentation is not as polished as I would like (I need to learn more about Docbook), but it's a good starting point.
You can help the project in various ways. The best thing to do first is to subscribe to the mailing-list as explained before, then check the archives and the Gnome bug database:
The latest versions of libxslt can be found on the xmlsoft.org server and on mirrors (France) or on the Gnome FTP server as a source archive; Antonin Sprinzl also provides a mirror in Austria. (NOTE that you need the libxml2, libxml2-devel, libxslt and libxslt-devel packages installed to compile applications using libxslt.) Igor Zlatkovic is now the maintainer of the Windows port; he provides binaries. Gary Pennington provides Solaris binaries. Steve Ball provides Mac OS X binaries.
I do accept external contributions, especially if compiling on another platform; get in touch with me to upload the package. I will keep them in the contrib directory.
Libxslt is also available from CVS:
The Gnome CVS base. Check the Gnome CVS Tools page; the CVS module is libxslt.
Usually the problem comes from the fact that the compiler doesn't get the right compilation or linking flags. There is a small shell script, xslt-config, which is installed as part of the usual libxslt install process and which provides those flags. Use
xslt-config --cflags
to get the compilation flags and
xslt-config --libs
to get the linker flags. Usually this is done directly from the Makefile as:
CFLAGS=`xslt-config --cflags`
LIBS=`xslt-config --libs`
Note also that if you use the EXSLT extensions from the program then you should prepend -lexslt to the LIBS options.
xsltproc --param test alpha foo.xsl foo.xml
the param does not get passed and ends up as ""
In a nutshell do a double escaping at the shell prompt:
xsltproc --param test "'alpha'" foo.xsl foo.xml
i.e. the string value is surrounded by " and ' then terminated by ' and ". Libxslt interprets the parameter values as XPath expressions, so the string ->alpha<- is interpreted as the node set matching this string. You really want ->'alpha'<- to be passed to the processor. And to allow this you need to escape the quotes at the shell level using ->"'alpha'"<- .
or use
xsltproc --stringparam test alpha foo.xsl foo.xml
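To see what the shell actually hands to xsltproc in each case, here is a small sketch in Python 3. It only uses the standard shlex module to mimic POSIX shell word splitting; it does not run xsltproc, and the helper name argv_for is made up for the illustration.

```python
import shlex

def argv_for(command_line):
    """Split a command line the way a POSIX shell would, without running it."""
    return shlex.split(command_line)

# Unquoted: xsltproc receives the bare word alpha, which it then reads as an
# XPath expression selecting elements named "alpha".
bare = argv_for("xsltproc --param test alpha foo.xsl foo.xml")

# Double escaping: the shell strips the outer " and keeps the inner ',
# so the value reaching xsltproc is the XPath string literal 'alpha'.
quoted = argv_for("""xsltproc --param test "'alpha'" foo.xsl foo.xml""")

print(bare[3])    # alpha
print(quoted[3])  # 'alpha'
```

With --stringparam the quoting is done by xsltproc itself, which is why the plain word works there.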
Yes, for example xmlwrapp; see the related pages about bindings.
The change log describes the recent commits to the CVS code base.
Those are the public releases made:
This is a bugfix only release
speed of large text output, xsl:copy with attributes, strip-space and namespaces prefix, fix for --path xsltproc option, EXSLT:tokenize (Shaun McCance), EXSLT:seconds (William Brack), sort with multiple keys (William Brack), checking of { and } for attribute value templates (William Brack)
stylesheet compilation (Igor Zlatkovic), NaN and sort (William Brack), RVT bug introduced in 1.0.30
Mostly a bug fix release.
This program is the simplest way to use libxslt: from the command line. It is also used for doing the regression tests of the library.
It takes as first argument the path or URL to an XSLT stylesheet; the next arguments are filenames or URIs of the inputs to be processed. The output of the processing is sent to the standard output. There are actually a few more options available:
orchis:~ -> xsltproc
Usage: xsltproc [options] stylesheet file [file ...]
   Options:
      --version or -V: show the version of libxml and libxslt used
      --verbose or -v: show logs of what's happening
      --output file or -o file: save to a given file
      --timing: display the time used
      --repeat: run the transformation 20 times
      --debug: dump the tree of the result instead
      --novalid: skip the Dtd loading phase
      --noout: do not dump the result
      --maxdepth val : increase the maximum depth
      --html: the input document is(are) an HTML file(s)
      --docbook: the input document is SGML docbook
      --param name value : pass a (parameter,value) pair
      --nonet refuse to fetch DTDs or entities over network
      --warnnet warn against fetching over the network
      --catalogs : use the catalogs from $SGML_CATALOG_FILES
      --xinclude : do XInclude processing on document intput
      --profile or --norman : dump profiling informations 
orchis:~ ->

DocBook is an XML/SGML vocabulary particularly well suited to books and papers about computer hardware and software.
xsltproc and libxslt are not specifically dependent on DocBook, but since a lot of people use xsltproc and libxml2 for DocBook formatting, here are a few pointers and pieces of information which may be helpful:
export XMLCATALOG=$HOME/xmlcatalog
should allow processing DocBook documents without requiring network access for the DTD or stylesheets.
Do not use the --docbook option of xsltproc to process XML DocBook documents; this option is only intended to provide some (limited) support of the SGML version of DocBook.
Points which are not DocBook specific but still worth mentioning again:
xmllint --valid --noout path_to_document
to make sure that your input is valid DocBook, and fix the errors before processing further. Note that XSLT processing may work correctly with some forms of validity errors left, but in general it can cause trouble on output.
Okay, this section is clearly incomplete. But integrating libxslt into your application should be relatively easy. First check the few steps described below, then for more detailed information, look at the generated pages for the API, the source of libxslt/xsltproc.c, and the tutorial.
Basically doing an XSLT transformation can be done in a few steps:
xmlSubstituteEntitiesDefault(1);
xmlLoadExtDtdDefaultValue = 1;
Steps 2, 3, and 5 will probably need to be changed depending on your processing needs and environment, for example if reading/saving from/to memory, or if you want to apply XInclude processing to the stylesheet or input documents.
There are a number of language bindings and wrappers available for libxml2; the list below is not exhaustive. Please contact the xml-bindings@gnome.org (archives) in order to get updates to this list or to discuss the specific topic of libxml2 or libxslt wrappers or bindings:
The libxslt Python module depends on the libxml2 Python module.
The distribution includes a set of Python bindings, which are guaranteed to be maintained as part of the library in the future, though the Python interface has not yet reached the completeness of the C API.
Stéphane Bidoul maintains a Windows port of the Python bindings.
Note to people interested in building bindings: the API is formalized as an XML API description file which allows automating a large part of the Python bindings; this includes function descriptions, enums, structures, typedefs, etc. The Python script used to build the bindings is python/generator.py in the source distribution.
To install the Python bindings there are 2 options:
The distribution includes a set of examples and regression tests for the Python bindings in the python/tests directory. Here are some excerpts from those tests:
This is a basic test of XSLT interfaces: loading a stylesheet and a document, transforming the document and saving the result.
import libxml2
import libxslt
styledoc = libxml2.parseFile("test.xsl")
style = libxslt.parseStylesheetDoc(styledoc)
doc = libxml2.parseFile("test.xml")
result = style.applyStylesheet(doc, None)
style.saveResultToFilename("foo", result, 0)
style.freeStylesheet()
doc.freeDoc()
result.freeDoc()
The Python module is called libxslt; you will also need the libxml2 module for the operations on XML trees. Let's have a look at the objects manipulated in that example and how the processing is done:
styledoc: a libxml2 document tree. It is obtained by parsing the XML file "test.xsl" containing the stylesheet.
style: a precompiled stylesheet ready to be used by the following transformations (note the plural form, multiple transformations can reuse the same stylesheet).
doc: the document to apply the transformation to. In this case it is simply generated by parsing it from a file, but any other processing is possible as long as one gets a libxml2 Doc. Note that HTML trees are suitable for XSLT processing in libxslt. This is actually how this page is generated!
result: a document generated by applying the stylesheet to the document. Note that some of the stylesheet information may be related to the serialization of that document, and as in this example a specific saveResultToFilename() method of the stylesheet should be used to save it to a file (in that case to "foo").
Also note the need to explicitly deallocate documents with freeDoc(), except for the stylesheet document, which is freed when its compiled form is garbage collected.
This one is a far more complex test. It shows how to modify the behaviour of an XSLT transformation by passing parameters and how to extend the XSLT engine with functions defined in Python:
import sys
import libxml2
import libxslt
nodeName = None
def f(ctx, str):
    global nodeName
    #
    # Small check to verify the context is correctly accessed
    #
    try:
        pctxt = libxslt.xpathParserContext(_obj=ctx)
        ctxt = pctxt.context()
        tctxt = ctxt.transformContext()
        nodeName = tctxt.insertNode().name
    except:
        pass
    # str is the string argument passed from the stylesheet
    return str.upper()
libxslt.registerExtModuleFunction("foo", "http://example.com/foo", f)
This code defines and registers an extension function. Note that the function can be bound to any name (foo) and how the binding is also associated to a namespace name "http://example.com/foo". From an XSLT point of view the function just returns an upper case version of the string passed as a parameter. But the first part of the function also reads some contextual information from the current XSLT processing environment; in that case it looks for the current insertion node in the resulting output (either the resulting document or the Result Value Tree being generated), and saves it to a global variable for checking that the access actually worked.
For more information on the xpathParserContext and transformContext objects check the library internals description. The pctxt is actually an object from a class derived from the libxml2.xpathParserContext() with just a couple more properties, including the possibility to look up the XSLT transformation context from the XPath context.
styledoc = libxml2.parseDoc("""
<xsl:stylesheet version='1.0'
  xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
  xmlns:foo='http://example.com/foo'
  xsl:exclude-result-prefixes='foo'>
  <xsl:param name='bar'>failure</xsl:param>
  <xsl:template match='/'>
    <article><xsl:value-of select='foo:foo($bar)'/></article>
  </xsl:template>
</xsl:stylesheet>
""")
Here is a simple example of how to read an XML document from a Python string with libxml2. Note how this stylesheet:
style = libxslt.parseStylesheetDoc(styledoc)
doc = libxml2.parseDoc("<doc/>")
result = style.applyStylesheet(doc, { "bar": "'success'" })
style.freeStylesheet()
doc.freeDoc()
That part is identical to the basic example, except that the transformation is passed a dictionary of parameters. Note that the string passed, "success", had to be quoted; otherwise it is interpreted as an XPath query for the children of the root named "success".
root = result.children
if root.name != "article":
    print("Unexpected root node name")
    sys.exit(1)
if root.content != "SUCCESS":
    print("Unexpected root node content, extension function failed")
    sys.exit(1)
if nodeName != 'article':
    print("The function callback failed to access its context")
    sys.exit(1)
result.freeDoc()
That part just verifies that the transformation worked, that the parameter got properly passed to the engine, that the function f() got called and that it properly accessed the context to find the name of the insertion node.
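Since every value in the parameter dictionary is read as an XPath expression, plain strings need to be turned into XPath string literals first. The following is a minimal sketch in Python 3 of such a helper; the names xpath_string_literal and quote_params are hypothetical and not part of the libxslt bindings, and the concat() fallback relies only on XPath 1.0 having no escape syntax inside string literals.

```python
def xpath_string_literal(text):
    """Return an XPath 1.0 expression that evaluates to the string `text`."""
    if "'" not in text:
        return "'" + text + "'"
    if '"' not in text:
        return '"' + text + '"'
    # Both quote kinds are present: split on the apostrophes and glue the
    # single-quoted pieces back together with a double-quoted apostrophe.
    pieces = text.split("'")
    return "concat(" + ", \"'\", ".join("'" + p + "'" for p in pieces) + ")"

def quote_params(params):
    """Quote every value of a {name: plain string} parameter dictionary."""
    return {name: xpath_string_literal(value) for name, value in params.items()}

print(quote_params({"bar": "success"}))  # {'bar': "'success'"}
```

The resulting dictionary has the same shape as the one given to applyStylesheet() in the example above.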
This module is a bit too long to be described here, but it is basically a rewrite of the xsltproc command line interface of libxslt in Python. It provides nearly all the functionalities of xsltproc and can be used as a base module to write customized XSLT processors in Python. One of the things to notice is:
libxml2.lineNumbersDefault(1)
libxml2.substituteEntitiesDefault(1)
Those two calls in the main() function are needed to force the libxml2 processor to generate DOM trees compliant with the XPath data model.
This document describes the processing of libxslt, the XSLT C library developed for the Gnome project.
Note: this documentation is by definition incomplete and I am not good at spelling or grammar, so patches and suggestions are really welcome.
XSLT is a transformation language. It takes an input document and a stylesheet document and generates an output document:

Libxslt is written in C. It relies on libxml, the XML C library for Gnome, for the following operations:
Libxslt is not very specialized. It is built under the assumption that all nodes from the source and output document can fit in the virtual memory of the system. There is a big trade-off there. It is fine for reasonably sized documents but may not be suitable for large sets of data. The gain is that it can be used in a relatively versatile way. The input or output may never be serialized, but the size of the documents it can handle is limited by the size of the memory available.
More specialized memory handling approaches are possible, like building the input tree from a serialization progressively as it is consumed, factoring repetitive patterns, or even on-the-fly generation of the output as the input is parsed, but this is possible only for a limited subset of stylesheets. In general the implementation of libxslt follows this pattern:
The result is not that bad; clearly one can do a better job, but a more specialized one too. Most optimizations, like building the tree on demand, would need serious changes to the libxml XPath framework. An easy step would be to serialize the output directly (or call a set of SAX-like output handlers to keep this a flexible interface) and hence avoid the memory consumption of the result.
DOM-like trees, as used and generated by libxml and libxslt, are relatively complex. Most node types follow the given structure except a few variations depending on the node type:

Nodes carry a name, and the node type indicates the kind of node it represents; the most common ones are:
For the XSLT processing, entity nodes should not be generated (i.e. they should be replaced by their content). Most nodes also contain the following "navigation" information:
Element nodes carry the list of attributes in the properties; an attribute itself holds the navigation pointers and the children list (the attribute value is not represented as a simple string, to allow usage of entity references).
The ns points to the namespace declaration for the namespace associated to the node; nsDef is the linked list of namespace declarations present on element nodes.
Most nodes also carry a _private pointer which can be used by the application to hold specific data on this node.
There are a few steps which are clearly decoupled at the interface level:
A few things should be noted here:
This is the second step described. It takes a stylesheet tree, and "compiles" it. This associates to each node a structure stored in the _private field and containing information computed in the stylesheet:

One xsltStylesheet structure is generated per document parsed for the stylesheet. XSLT documents allow includes and imports of other documents; imports are stored in the imports list (hence keeping the tree hierarchy of includes, which is very important for a proper XSLT processing model) and includes are stored in the doclist list. An imported stylesheet has a parent link to allow browsing of the tree.
The DOM tree associated to the document is stored in doc. It is preprocessed to remove ignorable empty nodes, and all the nodes in the XSLT namespace are subject to precomputing. This usually consists of extracting all the context information from the context tree (attributes, namespaces, XPath expressions), and storing them in an xsltStylePreComp structure associated to the _private field of the node.
A couple of notable exceptions to this are XSLT template nodes (more on this later) and attribute value templates. If they are actually templates, the value cannot be computed at compilation time. (Some preprocessing could be done, like isolation and preparsing of the XPath subexpressions, but it's not done yet.)
The xsltStylePreComp structure also allows storing the precompiled form of an XPath expression that can be associated to an XSLT element (more on this later).
Proper handling of template lookup is one of the keys to fast XSLT processing. (Given a node in the source document, this is the process of finding which templates should be applied to this node.) Libxslt follows the hint suggested in the 5.2 Patterns section of the XSLT Recommendation, i.e. it doesn't evaluate the pattern as an XPath expression but tokenizes it and compiles it as a set of rules to be evaluated on a candidate node. There usually is an indication of the node name in the last step of this evaluation, and this is used as a key check for the match. As a result libxslt builds a relatively more complex set of structures for the templates:

Let's describe a bit more closely what is built. First the xsltStylesheet structure holds a pointer to the template hash table. All the XSLT patterns compiled in this stylesheet are indexed by the value of the target element (or attribute, pi ...) name, so when an element or an attribute "foo" needs to be processed the lookup is done using the name as a key.
Each of the patterns is compiled into an xsltCompMatch structure. It holds the set of rules based on the tokenization of the pattern, stored in reverse order (matching is easier this way). It also holds some information about the previous matches, used to speed up the process when one iterates over a set of siblings. (This optimization may be defeated by thrashing when running threaded computation; it's unclear that this is a big deal in practice.) Predicate expressions are not compiled at this stage; they may be at run-time if needed, but in this case they are compiled as full XPath expressions (the use of some fixed predicates can probably be optimized, they are not yet).
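Why storing the steps in reverse order makes matching easier can be shown with a toy model in Python 3. This is only an illustration of the idea, not libxslt's data structures: the Node class and both functions are invented for the sketch, and only child steps and "*" are handled.

```python
class Node:
    """Minimal tree node with the parent link that matching relies on."""
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

def compile_pattern(pattern):
    """Tokenize a child-axis-only pattern and store the steps last-first."""
    return list(reversed(pattern.split("/")))

def matches(steps, node):
    """Test a candidate node by walking up its ancestors, one step each."""
    current = node
    for name in steps:
        if current is None or (name != "*" and current.name != name):
            return False
        current = current.parent
    return True

doc = Node("doc")
chapter = Node("chapter", doc)
title_in_chapter = Node("title", chapter)
title_in_doc = Node("title", doc)

steps = compile_pattern("chapter/title")
print(steps)                             # ['title', 'chapter']
print(matches(steps, title_in_chapter))  # True
print(matches(steps, title_in_doc))      # False
```

The first step tested is the name of the candidate node itself, which is the part that can also serve as a hash key.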
The xsltCompMatch structures are then stored in the hash table; the clash list is itself sorted by priority of the template to implement "naturally" the XSLT priority rules.
Associated to the compiled pattern is the xsltTemplate itself, containing the information required for the processing of the pattern including, of course, a pointer to the list of elements used for building the pattern result.
Last but not least, a number of patterns do not fit in the hash table because they are not associated to a name; this is the case for patterns applying to the root, any element, any attribute, text nodes, pi nodes, keys etc. Those are stored independently in the stylesheet structure as separate linked lists of xsltCompMatch.
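The overall shape of that lookup structure can be sketched as follows in Python 3. This is a simplified model under stated assumptions, not the libxslt implementation: the TemplateTable class is hypothetical, a single list stands in for the several name-less pattern lists, and a real lookup would still run each candidate's compiled pattern against the node.

```python
from collections import defaultdict

class TemplateTable:
    """Toy index of templates: a name-keyed table plus a name-less list."""
    def __init__(self):
        self.by_name = defaultdict(list)   # name -> entries, best priority first
        self.nameless = []                 # patterns with no usable name key

    def add(self, key_name, priority, template):
        bucket = self.by_name[key_name] if key_name else self.nameless
        bucket.append((priority, template))
        # Keep each clash list sorted, highest priority first
        bucket.sort(key=lambda entry: -entry[0])

    def lookup(self, node_name):
        candidates = self.by_name.get(node_name, []) + self.nameless
        if not candidates:
            return None
        # Highest priority wins in this toy model
        return max(candidates, key=lambda entry: entry[0])[1]

table = TemplateTable()
table.add("title", 0.0, "generic-title")
table.add("title", 0.5, "chapter-title")
table.add(None, -0.5, "any-element")

print(table.lookup("title"))  # chapter-title
print(table.lookup("para"))   # any-element
```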
The processing is defined by the XSLT specification (the basis of the algorithm is explained in the Introduction section). Basically it works by taking the root of the input document and applying the following algorithm:
The closure is usually done through the XSLT apply-templates construct, recursing by applying the adequate template on the input node children or on the result of an associated XPath selection lookup.
Note that large parts of the input tree may not be processed by a given stylesheet and that, conversely, some may be processed multiple times. (This is often the case when a Table of Contents is built.)
The module transform.c is the one implementing most of this logic. xsltApplyStylesheet() is the entry point; it allocates an xsltTransformContext containing the following:
Then a new document gets allocated (HTML or XML depending on the type of output), and the user parameters and global variables and parameters are evaluated. Then xsltProcessOneNode(), which implements the 1-2-3 algorithm, is called on the root element of the input. Step 1/ is implemented by calling xsltGetTemplate(), step 2/ is implemented by xsltDefaultProcessOneNode() and step 3/ is implemented by xsltApplyOneTemplate().
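The three-step recursion can be mimicked with a short Python 3 sketch on top of the standard xml.etree.ElementTree module. It is a toy under stated assumptions: templates are plain Python functions keyed by element name, the output is a list of strings, and the default rule follows the XSLT built-in behaviour of copying text and recursing into children. None of the function names below exist in libxslt.

```python
import xml.etree.ElementTree as ET

def process_node(node, templates, out):
    """Steps 1-3 for one element: look up a template, else use the default."""
    template = templates.get(node.tag)          # step 1: find a template
    if template is None:
        default_process(node, templates, out)   # step 2: built-in default rule
    else:
        template(node, templates, out)          # step 3: run the template

def default_process(node, templates, out):
    """Built-in rule: copy text through and recurse into child elements."""
    if node.text:
        out.append(node.text)
    for child in node:
        process_node(child, templates, out)
        if child.tail:
            out.append(child.tail)

def title_template(node, templates, out):
    """A user template: wrap the title text in brackets."""
    out.append("[" + (node.text or "") + "]")

doc = ET.fromstring("<doc><title>Intro</title><para>text</para></doc>")
output = []
process_node(doc, {"title": title_template}, output)
print("".join(output))  # [Intro]text
```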
The XPath support is actually implemented in the libxml module (where it is reused by the XPointer implementation). XPath is a relatively classic expression language. The only uncommon feature is that it works on XML trees and hence has specific syntax and types to handle them.
XPath expressions are compiled using xmlXPathCompile(). It takes an expression string as input and generates a structure containing the parsed expression tree; for example the expression:
/doc/chapter[title='Introduction']
will be compiled as
Compiled Expression : 10 elements
  SORT
    COLLECT  'child' 'name' 'node' chapter
      COLLECT  'child' 'name' 'node' doc
        ROOT
      PREDICATE
        SORT
          EQUAL =
            COLLECT  'child' 'name' 'node' title
              NODE
            ELEM Object is a string : Introduction
              COLLECT  'child' 'name' 'node' title
                NODE
This can be tested using the testXPath command (in the libxml codebase) using the --tree option.
Again, the KISS approach is used. No optimization is done. This could be an interesting thing to add. Michael Kay describes a lot of possible and interesting optimizations done in Saxon which would be possible at this level. I'm unsure they would provide much gain since the expressions tend to be relatively simple in general and stylesheets are still hand generated. Optimizations at the interpretation level sound likely to be more efficient.
The interpreter is implemented by xmlXPathCompiledEval(), which is the front-end to xmlXPathCompOpEval(), the function implementing the evaluation of the expression tree. This evaluation follows the KISS approach again. It's recursive and calls xmlXPathNodeCollectAndTest() to collect node sets when evaluating a COLLECT node.
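The recursive style of evaluation can be illustrated with a very small Python 3 model over xml.etree.ElementTree. It is only a sketch of the approach: the tuple encoding of the operations is invented here, and just ROOT and child-axis COLLECT by name are handled, which is enough for a path like /doc/chapter.

```python
import xml.etree.ElementTree as ET

def evaluate(op, root):
    """Recursively evaluate a tiny expression tree into a list of elements."""
    kind = op[0]
    if kind == "ROOT":
        # Stand-in for the document node: a wrapper whose only child is root
        wrapper = ET.Element("#document")
        wrapper.append(root)
        return [wrapper]
    if kind == "COLLECT":
        _, name, source = op
        found = []
        for node in evaluate(source, root):   # evaluate the input step first
            found.extend(child for child in node if child.tag == name)
        return found
    raise ValueError("unsupported operation: " + kind)

# /doc/chapter, nested the same way as in the dump above: the outer COLLECT
# consumes the result of the inner one, which consumes ROOT.
tree = ("COLLECT", "chapter", ("COLLECT", "doc", ("ROOT",)))
doc = ET.fromstring("<doc><chapter/><chapter/><appendix/></doc>")
print(len(evaluate(tree, doc)))  # 2
```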
An evaluation is done within the framework of an XPath context stored in an xmlXPathContext structure; in the framework of a transformation the context is maintained within the XSLT context. Its content follows the requirements from the XPath specification:
For the purpose of XSLT an extra pointer has been added, allowing retrieval of the XSLT transformation context. When an XPath evaluation is about to be performed, an XPath parser context is allocated containing an XPath object stack (this is actually an XPath evaluation context; the name is a remnant of the time when there was no separate parsing and evaluation phase in the XPath implementation). Here is an overview of the set of contexts associated to an XPath evaluation within an XSLT transformation:

Clearly this is a bit too complex and confusing, and it should be refactored at the next set of binary incompatible releases of libxml. For example the xmlXPathCtxt has a lot of unused parts and should probably be merged with xmlXPathParserCtxt.
An XPath expression manipulates XPath objects. XPath defines the default types boolean, numbers, strings and node sets. XSLT adds the result tree fragment type, which is basically an unmodifiable node set.
Implementation-wise, libxml follows again a KISS approach: the xmlXPathObject is a structure containing a type description and the various possibilities. (Using an enum could have gained some bytes.) In the case of node sets (or result tree fragments), it points to a separate xmlNodeSet object which contains the list of pointers to the document nodes:
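As a rough picture of such a "type tag plus one slot per possibility" object, here is a Python 3 sketch. The class and field names are illustrative only, and the conversion shown follows the rules of the XPath 1.0 boolean() function rather than any libxml code.

```python
import math
from dataclasses import dataclass, field

@dataclass
class XPathObject:
    """Toy value holder: a type tag plus one slot per possible kind."""
    type: str                      # "boolean", "number", "string" or "nodeset"
    boolval: bool = False
    floatval: float = 0.0
    stringval: str = ""
    nodeset: list = field(default_factory=list)

def to_boolean(obj):
    """Conversion rules of the XPath 1.0 boolean() function."""
    if obj.type == "boolean":
        return obj.boolval
    if obj.type == "number":
        # True unless the number is zero or NaN
        return obj.floatval != 0 and not math.isnan(obj.floatval)
    if obj.type == "string":
        return len(obj.stringval) > 0
    return len(obj.nodeset) > 0    # a node set is true when non-empty

print(to_boolean(XPathObject("number", floatval=float("nan"))))  # False
print(to_boolean(XPathObject("string", stringval="false")))      # True
```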

The XPath API (and its 'internal' part) includes a number of functions to create, copy, compare, convert or free XPath objects.
All the XPath functions available to the interpreter are registered in the function hash table linked from the XPath context. They all share the same signature:
void xmlXPathFunc (xmlXPathParserContextPtr ctxt, int nargs);
The first argument is the XPath interpretation context, holding the interpretation stack. The second argument defines the number of objects passed on the stack for the function to consume (the last argument is on top of the stack).
Basically an XPath function does the following:
check nargs for proper handling of errors or functions with variable numbers of parameters
pop the parameters from the stack using obj = valuePop(ctxt);
push the result on the stack using valuePush(ctxt, res);
free the input parameters with xmlXPathFreeObject(obj);
Sometimes the work can be done directly by modifying in-situ the top object on the stack, ctxt->value.
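The calling convention can be imitated in a few lines of Python 3. This is a toy model of the stack discipline only: the ParserContext class and its method names are stand-ins invented for the sketch, and Python's garbage collector replaces the explicit freeing step.

```python
class ParserContext:
    """Toy interpretation context holding the value stack."""
    def __init__(self):
        self.stack = []
        self.error = None

    def value_push(self, obj):
        self.stack.append(obj)

    def value_pop(self):
        return self.stack.pop()

def upper_case(ctxt, nargs):
    """A function with the shape (context, nargs): pop args, push result."""
    if nargs != 1:                      # check the argument count first
        ctxt.error = "invalid arity"
        return
    arg = ctxt.value_pop()              # last argument is on top of the stack
    ctxt.value_push(str(arg).upper())   # leave exactly one result behind

ctxt = ParserContext()
ctxt.value_push("success")
upper_case(ctxt, 1)
print(ctxt.stack)  # ['SUCCESS']
```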
Not to be confused with the XPath object stack, this stack holds the XSLT variables and parameters as they are defined through the recursive calls of call-template, apply-templates and default templates. This is used to define the scope of variables being called.
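The scoping idea can be pictured with a toy frame stack in Python 3. This is not how libxslt stores variables; the VariableStack class is hypothetical and only models the XSLT rule that a template sees its own variables and the global ones, but not the locals of the template that invoked it.

```python
class VariableStack:
    """Toy stack of variable frames, one per template invocation."""
    def __init__(self):
        self.frames = [{}]            # frame 0 holds the global variables

    def push_frame(self):
        """Enter a template (call-template or apply-templates)."""
        self.frames.append({})

    def pop_frame(self):
        """Leave the current template, dropping its variables."""
        self.frames.pop()

    def define(self, name, value):
        self.frames[-1][name] = value

    def lookup(self, name):
        # Only the current template's frame and the globals are in scope
        for frame in (self.frames[-1], self.frames[0]):
            if name in frame:
                return frame[name]
        raise KeyError(name)

stack = VariableStack()
stack.define("bar", "global value")
stack.push_frame()                    # caller template
stack.define("local", 1)
stack.push_frame()                    # called template
print(stack.lookup("bar"))            # global value
```

Looking up "local" from the called template raises KeyError in this model, since the caller's frame is skipped.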
This part seems to need the most urgent attention right now: first, it is done in a very inefficient way since the location of the variables and parameters within the stylesheet tree is still done at run time (it really should be done statically at compile time), and second, I am still unsure that my understanding of the template variables and parameter scope is actually right.
This part of the documentation is still to be written once this part of the code is stable. TODO
There is a separate document explaining how the extension support works.
Michael Kay wrote a really interesting article on Saxon internals and the work he did on performance issues. I wish I had read it before starting the libxslt design (I would probably have avoided a few mistakes and progressed faster). A lot of the ideas in his papers should be implemented or at least tried in libxslt.
The libxml documentation, especially the I/O interfaces and the memory management.
Redesign the XSLT stack frame handling. Far too much work is done at execution time. Similarly for the attribute value templates handling: at least the embedded subexpressions ought to be precompiled.
Allow output to be saved to a SAX-like output (this notion of a SAX-like API for output should be added directly to libxml).
Implement and test some of the optimizations explained by Michael Kay, especially:
Error reporting: there are a lot of cases where the XSLT specification specifies that a given construct is an error but which are not checked adequately by libxslt. Basically one should do a complete pass on the XSLT spec again and add all tests to the stylesheet compilation. Using the DTD provided in the appendix and making direct checks using the libxml validation API sounds like a good idea too (though one should take care not to raise errors for elements/attributes in different namespaces).
Double check all the places where the stylesheet compiled form might be modified at run time (extra removal of blank nodes, hint on the xsltCompMatch).
This document describes the work needed to write extensions to the standard XSLT library for use with libxslt, the XSLT C library developed for the Gnome project.
Before starting to read this document it is highly recommended to get familiar with the libxslt internals.
Note: this documentation is by definition incomplete and I am not good at spelling or grammar, so patches and suggestions are really welcome.
The XSLT specification provides two ways to extend an XSLT engine:
In both cases the extensions need to be associated to a new namespace, i.e. a URI used as the name for the extension's namespace (there is no need to have a resource there for this to work).
libxslt provides a few extensions itself, either in the libxslt namespace "http://xmlsoft.org/XSLT/namespace" or in namespaces for other well known extensions provided by other XSLT processors like Saxon, Xalan or XT.
Since extensions are bound to a namespace name, usually sets of extensions coming from a given source use the same namespace name, defining in practice a group of extensions providing elements, functions or both. From the libxslt point of view those are considered as an "extension module", and most of the APIs work at a module point of view.
Registration of new functions or elements is bound to the activation of the module. This is currently done by declaring the namespace as an extension by using the attribute extension-element-prefixes on the xsl:stylesheet element.
An extension module is defined by 3 objects:
Currently a libxslt module has to be compiled within the application using libxslt. There is no code to dynamically load shared libraries associated to a namespace (this may be added but is likely to become a portability nightmare).
The current way to register a module is to link the code implementing it with the application and to call a registration function:
int xsltRegisterExtModule(const xmlChar *URI,
                          xsltExtInitFunction initFunc,
                          xsltExtShutdownFunction shutdownFunc);
The associated header is read by:
#include <libxslt/extensions.h>
which also defines the types for the initialization and shutdown functions.
Once the module URI has been registered, and if the XSLT processor detects that a given stylesheet needs the functionality of an extension module, the module is initialized.
The xsltExtInitFunction type defines the interface for an initialization function:
/**
 * xsltExtInitFunction:
 * @ctxt:  an XSLT transformation context
 * @URI:  the namespace URI for the extension
 *
 * A function called at initialization time of an XSLT
 * extension module
 *
 * Returns a pointer to the module specific data for this
 * transformation
 */
typedef void *(*xsltExtInitFunction)(xsltTransformContextPtr ctxt,
                                     const xmlChar *URI);
There are 3 things to notice:
What this function is expected to do is:
There is a single call to do this registration:
int xsltRegisterExtFunction(xsltTransformContextPtr ctxt,
                            const xmlChar *name,
                            const xmlChar *URI,
                            xmlXPathEvalFunc function);
The registration is bound to a single transformation instance referred to by ctxt, name is the UTF8 encoded name for the NCName of the function, and URI is the namespace name for the extension (no checking is done, a module could register functions or elements from a different namespace, but it is not recommended).
The implementation of the function must have the signature of a libxml XPath function:
/**
 * xmlXPathEvalFunc:
 * @ctxt: an XPath parser context
 * @nargs: the number of arguments passed to the function
 *
 * an XPath evaluation function, the parameters are on the
 * XPath context stack
 */
typedef void (*xmlXPathEvalFunc)(xmlXPathParserContextPtr ctxt,
                                 int nargs);
The context passed to an XPath function is not an XSLT context but an XPath context. However it is possible to find one from the other:
xsltTransformContextPtr
         xsltXPathGetTransformContext
                          (xmlXPathParserContextPtr ctxt);
Conversely, the xmlXPathContextPtr associated to an xsltTransformContext is stored in its xpathCtxt field.
The first thing an extension function may want to do is to check the arguments passed on the stack; the nargs parameter tells how many of them were provided in the XPath expression. The function valuePop will extract them from the XPath stack:
#include <libxml/xpath.h>
#include <libxml/xpathInternals.h>
xmlXPathObjectPtr obj = valuePop(ctxt);
Note that ctxt is the XPath context, not the XSLT one. It is then possible to examine the content of the value. Check the description of XPath objects if necessary. The following is a common sequence checking whether the argument passed is a string and converting it using the built-in XPath string() function if this is not the case:
if (obj->type != XPATH_STRING) {
    valuePush(ctxt, obj);
    xmlXPathStringFunction(ctxt, 1);
    obj = valuePop(ctxt);
}
Most common XPath functions are available directly at the C level and are exported either in <libxml/xpath.h> or in <libxml/xpathInternals.h>.
The extension function may also need to retrieve the data associated to this module instance (the database connection in the previous example); this can be done using xsltGetExtData:
void * xsltGetExtData(xsltTransformContextPtr ctxt,
                      const xmlChar *URI);
Again the URI to be provided is the one which was used when registering the module.
Once the function finishes, don't forget to push the return value on the stack with valuePush(ctxt, obj) and to free the arguments popped from the stack with xmlXPathFreeObject(obj).
The module libxslt/functions.c contains the sources of the XSLT built-in functions, including document(), key(), generate-id(), etc. as well as a full example module at the end. Here is the test function implementation for the libxslt:test function:
/**
 * xsltExtFunctionTest:
 * @ctxt:  the XPath Parser context
 * @nargs:  the number of arguments
 *
 * function libxslt:test() for testing the extensions support.
 */
static void
xsltExtFunctionTest(xmlXPathParserContextPtr ctxt, int nargs)
{
    xsltTransformContextPtr tctxt;
    void *data;
    tctxt = xsltXPathGetTransformContext(ctxt);
    if (tctxt == NULL) {
        xsltGenericError(xsltGenericErrorContext,
            "xsltExtFunctionTest: failed to get the transformation context\n");
        return;
    }
    data = xsltGetExtData(tctxt, (const xmlChar *) XSLT_DEFAULT_URL);
    if (data == NULL) {
        xsltGenericError(xsltGenericErrorContext,
            "xsltExtFunctionTest: failed to get module data\n");
        return;
    }
#ifdef WITH_XSLT_DEBUG_FUNCTION
    xsltGenericDebug(xsltGenericDebugContext,
                     "libxslt:test() called with %d args\n", nargs);
#endif
}
There is a single call to do this registration:
int xsltRegisterExtElement(xsltTransformContextPtr ctxt,
                           const xmlChar *name,
                           const xmlChar *URI,
                           xsltTransformFunction function);
It is similar to the mechanism used to register an extension function, except that the signature of an extension element implementation is different.
The registration is bound to a single transformation instance referred to by ctxt, name is the UTF8 encoded name for the NCName of the element, and URI is the namespace name for the extension (no checking is done, a module could register elements for a different namespace, but it is not recommended).
The implementation of the element must have the signature of an XSLT transformation function:
/** 
 * xsltTransformFunction: 
 * @ctxt: the XSLT transformation context
 * @node: the input node
 * @inst: the stylesheet node 
 * @comp: the compiled information from the stylesheet 
 * 
 * signature of the function associated to elements part of the
 * stylesheet language like xsl:if or xsl:apply-templates.
 */ 
typedef void (*xsltTransformFunction)
                          (xsltTransformContextPtr ctxt,
                           xmlNodePtr node,
                           xmlNodePtr inst,
                           xsltStylePreCompPtr comp);
The first argument is the XSLT transformation context. The second and third arguments are xmlNodePtr, i.e. the internal memory representation of XML nodes. They are respectively node, from the input document being transformed by the stylesheet, and inst, the extension element in the stylesheet. The last argument is comp, a pointer to a precompiled representation of inst, but usually for an extension element this value is NULL by default (it could be added and associated to the instruction in inst->_private).
The same functions are available from a function implementing an extension element as in an extension function, including xsltGetExtData().
Since the goal of an extension element is usually to enrich the generated output, it is expected that it will grow the output tree currently being generated. This can be done by grabbing ctxt->insert, which is the current libxml node being generated (note that this can also be the intermediate value tree being built, for example to initialize a variable; the processing should be similar). The functions for libxml tree manipulation from <libxml/tree.h> can be employed to extend or modify the tree, but it is required to preserve the insertion node and its ancestors, since there are existing pointers to those elements still in use in the XSLT template execution stack.
The module libxslt/transform.c contains the sources of the XSLT built-in elements, including xsl:element, xsl:attribute, xsl:if, etc. There is a small but full example in functions.c providing the implementation for the libxslt:test element; it will output a comment in the result tree:
/**
 * xsltExtElementTest:
 * @ctxt:  an XSLT processing context
 * @node:  The current node
 * @inst:  the instruction in the stylesheet
 * @comp:  precomputed informations
 *
 * Process a libxslt:test node
 */
static void
xsltExtElementTest(xsltTransformContextPtr ctxt, xmlNodePtr node,
                   xmlNodePtr inst,
                   xsltStylePreCompPtr comp)
{
    xmlNodePtr comment;
    if (ctxt == NULL) {
        xsltGenericError(xsltGenericErrorContext,
                         "xsltExtElementTest: no transformation context\n");
        return;
    }
    if (node == NULL) {
        xsltGenericError(xsltGenericErrorContext,
                         "xsltExtElementTest: no current node\n");
        return;
    }
    if (inst == NULL) {
        xsltGenericError(xsltGenericErrorContext,
                         "xsltExtElementTest: no instruction\n");
        return;
    }
    if (ctxt->insert == NULL) {
        xsltGenericError(xsltGenericErrorContext,
                         "xsltExtElementTest: no insertion point\n");
        return;
    }
    comment =
        xmlNewComment((const xmlChar *)
                      "libxslt:test element test worked");
    xmlAddChild(ctxt->insert, comment);
}
When the XSLT processor ends a transformation, the shutdown function (if it exists) for each of the modules initialized is called. The xsltExtShutdownFunction type defines the interface for a shutdown function:
/**
 * xsltExtShutdownFunction:
 * @ctxt:  an XSLT transformation context
 * @URI:  the namespace URI for the extension
 * @data:  the data associated to this module
 *
 * A function called at shutdown time of an XSLT extension module
 */
typedef void (*xsltExtShutdownFunction) (xsltTransformContextPtr ctxt,
                                         const xmlChar *URI,
                                         void *data);
This is really similar to a module initialization function, except that a third argument is passed: the value that was returned by the initialization function. This allows the routine to deallocate resources from the module, for example to close the connection to the database, to keep the same example.
Well, some of the pieces missing: