Dimitre's profileXSLT: Riding the challen...BlogListsNetwork Tools Help

Blog


    September 08

    Part 3: The Swiss Army Knife of Data Structures ... in C#

    In Part 1 of these series I described an implementation of the Finger Tree data structure in C#.

    Part 2 reported the results of performance tests with the standard .NET Enumerable class and the finger-tree-based sequence

    The new data provided here is the results of similar performance tests, this time between the finger-tree-based sequence and XPath 2.0 fuctions on sequences as implemented by a group of three XSLT 2.0 processors.

    I apologize for not specifying the names of the XSLT 2.0 processors used in this experiment. This was done intentionally, since the purpose of the experiment was only to compare the current efficiency of the XPath sequences and strings as implemented by these XSLT processors to the efficiency of a Finger Tree - based implementation. This experiment did not have the goal of comparing the XSLT processors with each other. The results were consistent with and did not alter my prior perception of each XSLT processor's efficiency.

    Also, the usual warning:

    Disclaimer: This comparison was carried out on my 4-year old PC, having CPU speed of 3GHz and RAM of 2GB and running Windows XP.  Results will largely vary on other configurations, depending on CPU and memory speed, available memory, cache size, ..., etc. Another factor to consider is the time necessary for jitting the .NET IL code on initial method execution or the JVM startup time. However the observed ratios (speedup) should remain closely the same.

    The following XPath 2.0 functions on sequences were tested:

    fn:insert-before()

    op:concatenate()

    fn:remove()

    fn:subsequence()

    op:item-access()  (the position() function with integer argument)

    fn:reverse()

    The following XPath 2.0 functions on strings were tested:

    fn:string-length()

    fn:concat()     (with 2 arguments)

    fn:substring()

    The finger-tree-based sequence was implemented in such a way that an instance of it was instantiated as an extension object of the Saxon 9.1 (for .NET) XSLT processor and then the methods were referenced as extension functions.

    Any of these extension functions was executed 100 000 times and the average time was recorded. Unfortunately, two of the three tested XSLT processors were very slow, which necessitated that during the tests of these XSLT processors many functions had to be executed 10 000 times or even only 1000 or, in a very few, exceptional circumstances, only 100 times.

    Writing such code can be tricky, especially due to the very aggressive optimization performed by one of the tested XSLT processors.  Thus, in order to obscure from such a clever XSLT processor the fact that a variable had been defined outside of an <xsl:for-each> instruction, something like the following code had to be used:

    <xsl:for-each select="1 to 100000">

      <xsl:variable name="vFSeq3"
        select="if(. mod 2 eq 0)
                       
    then $vFSeq1
                       
    else $vFSeq2
                     
    "/>

      <xsl:variable name="vRem" as="xs:integer"
                             select=". mod 10"/>

      <xsl:variable name="vFSeqY"
         select="if($vRem eq 1)
                       
    then $vFSeqY1
                     
    else if($vRem eq 2)
                       
    then $vFSeqY2
                     
    else if($vRem eq 3)
                       
    then $vFSeqY3
                     
    else if($vRem eq 4)
                       
    then $vFSeqY4
                     
    else if($vRem eq 5)
                       
    then $vFSeqY5
                     
    else if($vRem eq 6)
                       
    then $vFSeqY6
                     
    else if($vRem eq 7)
                       
    then $vFSeqY7
                     
    else if($vRem eq 8)
                       
    then $vFSeqY8
                     
    else if($vRem eq 9)
                       
    then $vFSeqY9
                     
    else $vFSeqY10
    "/>

      <xsl:variable name="vFSeq5" select=
                           "ext:Merge($vFSeq3, $vFSeqY)"/>

      <xsl:variable name="vLen" select=
                            "number(ext:Length($vFSeq5))"/>

      <xsl:value-of select="(1,2,3)[$vLen]"/>
    </xsl:for-each>

    Note how the last instruction within the <xsl:for-each> above causes a lazy-evaluating XSLT processor to finish its work and produce the complete sequence.

    The sequences (and character strings) on which the above functions were performed had length of 10 000.

    The code for these tests is here.

    As in the .NET comparison test (reported in Part 2), there were two approaches:

    • Testing "the worst or extreme case", when the argument to be inserted, concatenated, removed, used as index, ... was a value equal or very close to 10000.
    • Testing "the average case": with many different values for this argument, uniformly increasing in the interval [1000, 10000].

    As can be seen from these tables (my weblog host allows only limited length for an entry and could not accommodate them inline), which summarize the results of the "uniform" or "average" testing of the sequence implementations, using a Finger Tree as a general sequence data structure in XPath 2.0 may speed up sequence operations (roughly) 5 to 40 times even when only the best of the three XSLT processors' results are compared, 5 to 4500 times if all results are taken into account.

    The XPath string implementation of the three XSLT 2.0 processors was tested against a finger-tree based string object and the results from the "average" tests are summarized in these tables.

    As can be seen, using a Finger Tree to represent a string in XPath 2.0 may speed up string operations (roughly) 3 to 9 times even when only the best of the three XSLT processors' results are compared, 3 to 40 times if all results are taken into account.

    I will conclude the series on finger trees with the following, quite

    Obvious Conclusion:
    Existing sequence and string implementations in .NET and in XSLT 2.0 processors can be speeded up significantly by using the Finger Tree data structure. Additionally, .NET will benefit from having a truly immutable sequence implementation.

    Certainly, the question is not "if" such implementations will be produced, but only "when".

    Comments (7)

    Please wait...
    Sorry, the comment you entered is too long. Please shorten it.
    You didn't enter anything. Please try again.
    Sorry, we can't add your comment right now. Please try again later.
    To add a comment, you need permission from your parent. Ask for permission
    Your parent has turned off comments.
    Sorry, we can't delete your comment right now. Please try again later.
    You've exceeded the maximum number of comments that can be left in one day. Please try again in 24 hours.
    Your account has had the ability to leave comments disabled because our systems indicate that you may be spamming other users. If you believe that your account has been disabled in error please contact Windows Live support.
    Complete the security check below to finish leaving your comment.
    The characters you type in the security check must match the characters in the picture or audio.

    To add a comment, sign in with your Windows Live ID (if you use Hotmail, Messenger, or Xbox LIVE, you have a Windows Live ID). Sign in


    Don't have a Windows Live ID? Sign up

    123 haowrote:
    <a href="http://www.rmt-wm.com">RMT-WM</a> <a href="http://rmtvip.jp">rmtvip</a>The baby eagle liked the nest. 「<a href="http://www.rmt-wm.com/games/aion.html">-aion rmt</a>」It was the only world he had ever known. <a href="http://www.rmtvip.jp/game/aion.html">aion RMT</a> It was warm and comfortable, had a great view, 「<a href="http://www.rmt-wm.com/games/0003.html"> リネージュ2 rmt</a> and even better, he had all the food and love and attention that a great mother eagle could provide. <a href="http://www.rmtvip.jp/game/Lineage2_リネージュ2_RMT.html">リネージュ2 RMT</a> Many times each day the mother would swoop down from the sky and land in the nest and feed the baby eagle delicious morsels of food. She was like a god to him, he had no where she came from or how she worked her magic. <a href="http://www.rmt-wm.com/games/A004.html">アラド戦記 rmt</a>The baby eagle was hungry all the time, but the mother eagle would always come just in time with the food and love and attention he craved. The baby eagle grew strong. <a href="http://www.rmtvip.jp/game/アラド戦記_RMT.html">アラド戦記 RMT</a>His vision grew very sharp. He felt good all the time. <a href="http://www.rmt-wm.com/games/0005.html">ff11 rmt</a>Until one day, the mother stopped coming to the nest. <a href="http://www.rmtvip.jp/game/FF11_FFXI_RMT.html">FF11 RMT</a>Thelooked up at the mother and cried "Why did you abandon me? I'm going to die any minute. How could you do faster. He screamed. "I'm dying I'm dying," he cried. He picked up more speed. <a href="http://www.replica-watches-mall.com">Replica Watche</a>
    He looked up at his mother. "How could you do this to me?" He looked down. Something strange happens.
    Sept. 27
    No namewrote:
    您需要二手液晶显示屏废旧液晶屏么?我们是不折不扣的二手液晶屏、旧液晶屏大批发商,长期大量供应可再利用的旧液晶屏。我公司提供的各种尺寸的二手液晶屏, 不同厚薄如笔记本屏,均已经过我们严格的分类,检验,测试流程。请访问协力液晶屏www.sceondhandlcd.com[bigeedjjijdgjegi]
    Nov. 25
    Oct. 30
    Oct. 30
    No namewrote:

    Hi,Do you need advertising displays, digital signages, advertising player and LCD displays? Please go Here:www.amberdigital.com.hk(Amberdigital).we have explored and developed the international market with professionalism. We have built a widespread marketing network, and set up a capable management team dedicated to provide beyond-expectation services to our customers.

    amberdigital Contact Us

    website:www.amberdigital.com.hk
    alibaba:amberdigital.en.alibaba.com[jfedhhdhfdiecf]

    Oct. 15
    No namewrote:

    Hi,Do you need advertising displays, digital signages, mp4 advertisement players, SD card players and advertisement LCD displays? Please go Here:www.amberdigital.com.hk(Amberdigital).we have explored and developed the international market with professionalism. We have built a widespread marketing network, and set up a capable management team dedicated to provide beyond-expectation services to our customers.

    amberdigital Contact Us

    website:www.amberdigital.com.hk
    alibaba:amberdigital.en.alibaba.com[dejeicbdbiijbc]

    Sept. 22
    No namewrote:
    I'm afraid I would have to do a lot of background work to equip myself to understand the fingertree paper cited.

    But my main concern about the approach as applied to XSLT is that Saxon is designed with the primary objective that sequences should not be materialized in memory unless absolutely essential. Operations like subsequence(), insert-before(), sequence concatenation, and so on are therefore implemented as iterators applied to a base iterator. The only requirement on a data structure used to store sequence is therefore that iterating over its members should be fast. (That's not quite true; finding the Nth item can also be significant). A fingertree might be able to perform all sorts of clever tricks very rapidly, but this would only be of any use if (a) sequences are materialized often enough for this to make an impact, and (b) the operations like subsequence() are made polymorphic - that is, they are implemented differently depending on what kind of input data structure they are processing. Very often the input sequence is actually an axis in an XML tree. It's therefore not clear to me that defining a better data structure for those sequences that need to be materialized in memory is a fruitful area for achieving XSLT speedup.

    But perhaps I've missed something?
    Sept. 20

    Trackbacks

    The trackback URL for this entry is:
    http://dnovatchev.spaces.live.com/blog/cns!44B0A32C2CCF7488!672.trak
    Weblogs that reference this entry
    • None