markdownxquery中递归的介绍(代码片段)

author author     2022-12-16     404

关键词:

## An introduction to recursion in XQuery

- Created: Nov 28, 2017
- Updated: Nov 29, 2017: Now covers transformation of XML documents

Recursion is a powerful programming technique, but the idea is simple: instead of performing a single operation, a function calls *itself* repeatedly to whittle through a larger task. In XQuery, recursion can be used to accomplish complex tasks on data that a plain FLWOR expression (which iterates through a sequence) cannot, such as transforming an entire XML document from one format into another, like TEI or DocBook into HTML, EPUB, LaTeX, or XSL-FO. Transforming a document is well-suited to recursion because each of the document's nodes may need to be examined and manipulated based on the node's type, name, and location in the document; and once a node has been processed, the transformation must continue processing the nodes' children and descendants until the deepest leaf node has been processed. But learning the technique of recursion is often hard for a beginning programmer, perhaps because tutorials jump directly to complex operations such as document transformation. So let's start with a simple illustration of the technique of recursion in XQuery and build up to document transformation.

To illustrate basic recursion in XQuery, here is a simple function, `local:countdown()`, which takes a number, returns that number, and then calls itself with a new input (the original number minus one), until the input is 1, at which point it returns the words "lift off!" and ends:

```xquery
xquery version "1.0";

declare function local:countdown($n as xs:integer) 
    $n, 
    if ($n eq 1) then 
        "lift off!" 
    else 
        local:countdown($n - 1)
;

local:countdown(5)
```

The query returns the sequence, `(5, 4, 3, 2, 1, "lift off!")`.

We could also have accomplished the same result with a FLWOR expression:

```xquery
xquery version "1.0";

let $start := 5
for $n in reverse(1 to $start)
return
    (
        $n, 
        if ($n eq 1) then 
            "lift off" 
        else 
            ()
    )
```

This query returns the identical sequence as the recursive query above, and it is a perfectly valid, even preferable, approach to the challenge of a simple countdown. Understanding how these different approaches arrived at the identical result is a critical first step to understanding recursion. To guide your thinking, consider this: The FLWOR expression is the XQuery language's powerful and built-in technique for iterating through sequences. In contrast, recursion, which is a technique rather than a built-in expression, opens the door to processing more complex structures than flat sequences. The FLWOR expression may meet all your needs. But if you ever find yourself struggling with the limitations of the sequence-oriented design of the FLWOR expression, you may find that recursion is what you need to accomplish your goal.

Before demonstrating how to apply recursion to document transformation challenges, let's reinforce the concepts above by exploring new facilities in XQuery 3.0 that improve the expressiveness and convenience of using recursion in your code.

With XQuery 3.0, XQuery programmers are no longer limited to "named functions", like the one above, but can now also use [inline function expressions](https://www.w3.org/TR/xquery/#id-inline-func), which are "anonymous" functions defined directly in an expression rather than a traditional, standalone, named function that can be defined only in a query prolog. Inline functions are useful when a function only has local relevance and doesn't need to be invoked from other contexts. They are also useful for [higher-order functions](https://www.w3.org/TR/xquery/#id-higher-order-function-feature), functions that themselves accept function items as a parameter. 

Let's apply these ideas to our simple recursive countdown query from earlier. Adapting a named function to be an inline function can require some minor but important changes to a function's signature when the function calls itself recursively. In the case of our function, you might think you could use the same function signature, with the single parameter `$n as xs:integer`. For example:

```xquery
xquery version "3.0";

let $countdown := 
    function($n as xs:integer)  
        if ($n eq 1) then 
            "lift off!" 
        else 
            local:countdown($n - 1)
    
return
    $countdown(5)
```

However, this query fails with the following error:

> err:XPDY0002 variable '$countdown' is not set. [at line 7, column 9]

To avoid this error, we need to add an additional parameter to the inline function, which itself is a function. Then we can adapt the function call in the function body to use this parameter instead of the "not set" `$countdown` variable. Here is the result of these changes:

```xquery
xquery version "3.0";

let $countdown := 
    function($n as xs:integer, $f as function(*))  
        $n, 
        if ($n eq 1) then 
            "lift off!" 
        else
            $f($n - 1, $f) 
    
return
    $countdown(5, $countdown)
```

At first glance this workaround makes the use of recursion with inline function expressions a bit more cumbersome, but at least it works! This workaround is a pedagogical excuse to illustrate of the "higher-order function" feature of XQuery 3.0 mentioned above, as well as a particular kind of use of higher-order functions, called "callback functions." 

Callback functions are functions that are passed as a parameter to a function, which the receiving function "calls back" to with its results. For example, you could design function A to accept a callback function as a parameter, to which it will send its results. Now, when you call function A, you will pass function B to it as this parameter. Function A will then send its results to function B. 

Adapting this idea to our examples above, we can rework our "countdown" function to be a little more generic. Instead of hard-coding the "lift off!" announcement in this function, we can adapt the function to accept a *third* parameter, `$announce`, which will take an annonymous function that we will use as a callback function, i.e., calling it when the counting job is done:

```xquery
xquery version "3.0";

let $countdown := 
    function($n as xs:integer, $do as function(*), $announce as function(*))  
        $n, 
        if ($n eq 1) then 
            $announce() 
        else
            $do($n - 1, $do, $announce) 
    
let $announce := function()  "lift off at " || current-time() || "!" 
return
    $countdown(5, $countdown, $announce)
```

The query returns the sequence, `(5, 4, 3, 2, 1, "lift off at 13:00:25.779-05:00!")`. With this change, our "countdown" function now focuses on its core task of counting down, and the task of "announcing" the lift off is delegated to a separate function. 

We have one more variation to introduce. What if, instead of defining a new custom "announce" function, we wanted our "announce" function to call a built-in function, or a named function defined in the same module or other library module? To do this we can refer to the built-in or already-defined function using a [named function reference](https://www.w3.org/TR/xquery/#id-named-function-ref). For example, we can refer to the `fn:current-time()` function using the syntax `current-time#0`. Here, the `#0` indicates that we are referring to the version of this function that takes zero parameters. (Every function takes a certain number of arguments; this number is known as the function's [arity](https://www.w3.org/TR/xquery/#GLdt-argumentlist-arity).) Changing the value of `$announce` to this named function reference will mean that our query's final item will just be the current time. Between inline functions and named function references, you now know everything you need to be able to pass functions around as arguments in your XQuery code. 

Let's leverage all of the techniques here to transform an XML document from one format to another (HTML to TEI), not only renaming elements, but also enriching the content in the process. We will perform the transformation with a function, `local:transform()` that takes two parameters: node(s) from the source document and a custom function for enriching text nodes. The heart of this function will use the XQuery FLWOR and [typeswitch](https://www.w3.org/TR/xquery/#id-typeswitch) expressions to recusively step through the document structure, evaluate each input node by type and name, and apply the following rules:

1. Rename certain HTML elements: Turn `<html:h1>` (and `<html:h2>`, etc.) elements into `<tei:head>` elements, stripping off attributes and recursively passing all child nodes back to the `local:transform()` function.
2. Reuse certain HTML element names: Turn other "known" HTML elements into a TEI element of the same name, stripping off attributes and recursively passing all child nodes back to the `local:transform()` function.
3. For any other "unknown" elements from the source document, preserve the original element names and attributes, recursively passing all child nodes back to the `local:transform()` function.
4. Enrich text nodes by applying a custom function.
5. Let all other types of nodes through unchanged.

We will define our function for enriching text nodes as an inline function, using the `fn:analyze-string()` function to find substrings that match a simple regular expression date pattern and wrapping the matches with a `<tei:date>` element. Lastly, we will deploy a named function reference by referring to a verbosely named function (`local:make-tei-element-name()`) using a shorthand variable (`$tei`). 

Here is the full query that accomplishes what we have described:

```xquery
xquery version "3.1";

declare namespace html = "http://www.w3.org/1999/xhtml";
declare namespace tei = "http://www.tei-c.org/ns/1.0";

(:~ A convenience function for creating element names in the TEI namespace :)
declare function local:make-tei-element-name($local-name as xs:string) 
    QName("http://www.tei-c.org/ns/1.0", $local-name)
;

(:~ A skeleton function for recursively transforming HTML into TEI :) 
declare function local:transform($nodes as node()*, $enrich-text as function(*)) 
    let $tei := local:make-tei-element-name#1
    for $node in $nodes
    return
        typeswitch ($node)
            (: html:h* => tei:head :)
            case element(html:h1) | element(html:h2) | element(html:h3) return 
                element 
                     $tei("head")  
                     local:transform($node/node(), $enrich-text) 
            (: html:div => tei:div, html:p => tei:p :)
            case element(html:div) | element(html:p) return 
                element 
                     $tei($node/name())  
                     local:transform($node/node(), $enrich-text) 
            (: leave unknown elements intact, including attributes :)
            case element() return
                element 
                     node-name($node)  
                     $node/@*, local:transform($node/node(), $enrich-text) 
            case text() return
                $enrich-text($node)
            default return
                $node
;

let $input := 
    <div xmlns="http://www.w3.org/1999/xhtml">
        <h1>Introduction</h1>
        <p class="fact">Pearl Harbor was attacked on December 7, 1941.</p>
        <video width="320" height="240">
            <source src="movie.mp4" type="video/mp4"/>
        </video>
        <!-- to be continued -->
    </div>
let $enrich-text := 
    (: An inline function for enriching text :)
    function($text as text()) 
        let $analysis := analyze-string($text, "\w+ \d1,2, \d4")
        for $result in $analysis/*
        return
            if ($result instance of element(fn:match)) then
                element
                     local:make-tei-element-name("date")  
                     $result/string() 
            else
                $result/string()
    
return
    local:transform($input, $enrich-text)
```

This query returns the HTML document as a TEI document, with the date string wrapped in a TEI `<date>` element and the comment node and "unexpected" HTML elements intact:

```xml
<div xmlns="http://www.tei-c.org/ns/1.0">
    <head>Introduction</head>
    <p>Pearl Harbor was attacked on <date>December 7, 1941</date>.</p>
    <video xmlns="http://www.w3.org/1999/xhtml" width="320" height="240">
        <source src="movie.mp4" type="video/mp4"/>
    </video>
    <!-- to be continued -->
</div>
```

Take some time to unpack the logic and flow of the transformation here, making note of how we combined recursion with FLWOR and typeswitch expressions, as well as named, inline, and higher-order functions, to perform targeted operations on different parts of the source document. You may well be able to use a subset of these techniques for more straightforward transformations (for example, transforming elements but omitting text enrichment, or only performing text enrichment while keeping the vocabulary intact), but you can also build on these patterns to do more than is shown here.

Learning to use recursion and these other techniques takes some practice, so be on the look out for opportunities to use them. Before long, you'll find yourself reaching for them to help you manipulate and transform data and get work done.

markdownxquery资源(代码片段)

查看详情

递归的入门介绍(代码片段)

这是一只海龟,但小程要介绍的是递归。递归是一种很重要的结构与设计思想。之所以重要,是因为读者经常会遇到递归结构,或递归算法。而为了解决某些问题,递归的设计思想有时很有效果。本文介绍递归的设计与实现。小... 查看详情

二叉树的创建与遍历(递归实现)(代码片段)

...语总结一文中介绍了二叉树的基本结构。在不知道怎样用递归?按步骤来!一文中介绍了如何使用递归。二叉树的结构是递归的,所以创建、遍历也可以通过递归实现。下面是一颗二叉树:结点的定义:publicclassNodeIntegervalue;Nodel... 查看详情

二叉树的遍历(代码片段)

二叉树的遍历介绍算法实现先序遍历递归先序遍历非递归先序遍历中序遍历递归二叉树遍历非递归中序遍历后序遍历递归后序遍历非递归后序遍历层次遍历后续介绍二叉树的遍历可以说是二叉树最重要的一个内容,如果想对... 查看详情

二叉树的遍历(代码片段)

二叉树的遍历介绍算法实现先序遍历递归先序遍历非递归先序遍历中序遍历递归二叉树遍历非递归中序遍历后序遍历递归后序遍历非递归后序遍历层次遍历后续介绍二叉树的遍历可以说是二叉树最重要的一个内容,如果想对... 查看详情

二叉树的遍历(代码片段)

二叉树的遍历介绍算法实现先序遍历递归先序遍历非递归先序遍历中序遍历递归二叉树遍历非递归中序遍历后序遍历递归后序遍历非递归后序遍历层次遍历后续介绍二叉树的遍历可以说是二叉树最重要的一个内容,如果想对... 查看详情

vue两种组件类型介绍:递归组件和动态组件(代码片段)

一递归组件  递归组件的特性就是可以在自己的template模板中调用自己本身。值得注意的它必须设置name属性。  //递归组件recursive.vue<template><div><p>递归组件</p><Recursion:count="count+1"v-if="count<3">... 查看详情

递归就这么简单(代码片段)

递归介绍本来预算此章节是继续写快速排序的,然而编写快速排序往往是递归来写的,并且递归可能不是那么好理解,于是就有了这篇文章。在上面提到了递归这么一个词,递归在程序语言中简单的理解是:方法自己调用自己递... 查看详情

运用递归解决二叉树相关问题(代码片段)

...经介绍了如何解决树的遍历问题。我们也已经尝试过使用递归解决树的为前序遍历、中序遍历和后序遍历问题。事实上,递归是解决树相关问题的最有效和最常用的方法之一。本节中,我们将会介绍两种典型的递归方法... 查看详情

分治法(divideandconquer)详解(代码片段)

文章目录引言分治法的范式递归式求解递归式的三种方法代入法递归树法主方法引言在这篇blog中,我首先会介绍一下分治法的范式,接着给出它的递归式通式,最后我会介绍三种方法(代入法,递归树,... 查看详情

冒泡排序(代码片段)

...的工作中使用这些算法的地方非常少,感觉就偶尔用一下递归,而且递归也不是什么算法,其他的很多算法很少使用,但是既然人家问了,那也没办法,说不定是因为人家的公司确实高大上,工作中很多的场景可以使用这些算法... 查看详情

算法总结之递推与递归(代码片段)

递推算法递归算法大致包括两方面的内容:1)递归起点;2)递归关系递推起点递归起点一般由题目或者实际情况确定,不由递归关系推出。如果无法确定递归起点,那么递归算法就无法实现。可见,递归起点是递归算法中的重... 查看详情

07-08函数递归(代码片段)

[TOC]一函数递归调用介绍函数不仅可以嵌套定义,还可以嵌套调用,即在调用一个函数的过程中,函数内部又调用另一个函数,而函数的递归调用指的是在调用一个函数的过程中又直接或间接地调用该函数本身插图:恶搞图58例... 查看详情

最新情报:所有的递归都可以改写成非递归?(代码片段)

...查看。在上一节的最后,彤哥收到最新情报,说是所有的递归都可以改写成非递归,是不是真的呢?如何实现呢?有没有套路呢?让我们带着这些问题进 查看详情

最新情报:所有的递归都可以改写成非递归?(代码片段)

...查看。在上一节的最后,彤哥收到最新情报,说是所有的递归都可以改写成非递归,是不是真的呢?如何实现呢?有没有套路呢?让我们带着这些问题进 查看详情

栈和队列的简单介绍(代码片段)

1.栈1.1常用方法1.2栈的应用场景改变元素的序列将递归转化为循环比如:逆序打印链表//递归方式voidprintList(Nodenode) if(null!=node) printList(node.next); System.out.print(node.val+""); //循环方式voidprintList(Nodenode) if(n 查看详情

栈和队列的简单介绍(代码片段)

1.栈1.1常用方法1.2栈的应用场景改变元素的序列将递归转化为循环比如:逆序打印链表//递归方式voidprintList(Nodenode) if(null!=node) printList(node.next); System.out.print(node.val+""); //循环方式voidprintList(Nodenode) if(n 查看详情

c递归详解(通俗易懂)(代码片段)

...、定义    1.概述    2.条件    3.比较二、如何理解递归?    1.函数调用其他函数示例 :     2.函数调用函数自身示例:         3.函数调用自身的底层操作:         ①在主调函数调用被调函数之前——  ... 查看详情